Merge pull request #1 from demokratie-live/develop
NPM Version 1.0.2
ulfgebhardt authored Jan 18, 2019
2 parents b60ff4b + 751db8d commit 243bf93
Showing 7 changed files with 259 additions and 6 deletions.
3 changes: 1 addition & 2 deletions .gitignore
@@ -1,5 +1,4 @@
node_modules/
dist/
*.js
*.d.ts
*.svg
*.d.ts
25 changes: 24 additions & 1 deletion README.md
@@ -1 +1,24 @@
# scapacra
# scapacra
## Introduction
Scapacra (scraper, parser and crawler) is a framework for extracting data from different data sources.
The idea behind Scapacra is based on the ETL (extract, transform and load) process ([ETL](https://de.wikipedia.org/wiki/ETL-Prozess "ETL (extract, transform and load)")), and the framework defines a modular design pattern that provides a basic ETL workflow.

The framework is structured into three basic modules.
1. **Parser**:
The parser extracts the data from a defined document.
2. **Browser**:
The browser navigates through a structure and retrieves the desired fragments for the parser.
3. **Scraper**:
A scraper executes the browsers and parsers and provides their results through a centralized interface.
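
The relationship between the three modules can be sketched in TypeScript roughly as follows. This is only an illustration of the pattern described above, not the package's actual API: the names `Parser`, `Browser`, `runScraper`, `ListBrowser` and `TitleParser`, and all of their signatures, are assumptions made for this example.

```typescript
// Hypothetical interfaces: illustrative only, not taken from the scapacra package.

// A parser extracts data of type R from one document fragment of type D.
interface Parser<D, R> {
  parse(fragment: D): R;
}

// A browser walks some structure and yields the fragments to be parsed.
interface Browser<D> {
  fragments(): Iterable<D>;
}

// The scraper step: run the browser, feed every fragment to the parser,
// and return all results from one central place.
function runScraper<D, R>(browser: Browser<D>, parser: Parser<D, R>): R[] {
  const results: R[] = [];
  for (const fragment of browser.fragments()) {
    results.push(parser.parse(fragment));
  }
  return results;
}

// Toy browser (assumption): "navigates" an in-memory list of HTML strings.
class ListBrowser implements Browser<string> {
  private readonly pages: string[];

  constructor(pages: string[]) {
    this.pages = pages;
  }

  fragments(): Iterable<string> {
    return this.pages;
  }
}

// Toy parser (assumption): extracts the text of the <title> element, or "".
class TitleParser implements Parser<string, string> {
  parse(fragment: string): string {
    const match = /<title>(.*?)<\/title>/.exec(fragment);
    return match?.[1] ?? "";
  }
}

const titles = runScraper(
  new ListBrowser([
    "<html><title>First</title></html>",
    "<html><title>Second</title></html>",
  ]),
  new TitleParser(),
);
console.log(titles);
```

Keeping the browser and the parser behind separate interfaces means either one can be swapped (for example, a different data source or a different document format) without touching the scraper loop.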

## Parser

![Parser](out/doc/Parser/Parser.svg "UML of parser")

## Browser

![Browser](out/doc/Browser/Browser.svg "UML of browser")

## Scraper

![Scraper](out/doc/Scraper/Scraper.svg "UML of scraper")
35 changes: 35 additions & 0 deletions out/doc/Browser/Browser.svg
103 changes: 103 additions & 0 deletions out/doc/Importer/Importer.svg