other
coder-hxl committed Feb 17, 2023
1 parent d2409e6 commit f843c9f
Showing 7 changed files with 160 additions and 77 deletions.
55 changes: 44 additions & 11 deletions README.md
@@ -8,7 +8,8 @@ XCrawl is a Nodejs multifunctional crawler library.

- Crawl HTML, JSON, file resources, etc. with simple configuration
- Use the JSDOM library to parse HTML, or parse HTML by yourself
- Optional mode asynchronous/synchronous for batch requests
- The request method supports asynchronous/synchronous
- Support Promise/Callback
- Polling function
- Anthropomorphic request interval
- Written in TypeScript
@@ -47,6 +48,7 @@ XCrawl is a Nodejs multifunctional crawler library.
* [IFetchFileConfig](#IFetchFileConfig)
* [IFetchPollingConfig](#IFetchPollingConfig)
* [IFetchCommon](#IFetchCommon)
* [IFetchCommonArr](#IFetchCommonArr)
* [IFileInfo](#IFileInfo)
* [IFetchHTML](#IFetchHTML)
- [More](#More)
@@ -92,10 +94,26 @@ Create a crawler instance via new XCrawl. The request queue is maintained by the
```ts
class XCrawl {
constructor(baseConfig?: IXCrawlBaseConifg)
fetchHTML(config: IFetchHTMLConfig): Promise<IFetchHTML>
fetchData<T = any>(config: IFetchDataConfig): Promise<IFetchCommon<T>>
fetchFile(config: IFetchFileConfig): Promise<IFetchCommon<IFileInfo>>
fetchPolling(config: IFetchPollingConfig, callback: (count: number) => void): void

fetchHTML(
config: IFetchHTMLConfig,
callback?: (res: IFetchHTML) => void
): Promise<IFetchHTML>

fetchData<T = any>(
config: IFetchDataConfig,
callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>>

fetchFile(
config: IFetchFileConfig,
callback?: (res: IFetchCommon<IFileInfo>) => void
): Promise<IFetchCommonArr<IFileInfo>>

fetchPolling(
config: IFetchPollingConfig,
callback: (count: number) => void
): void
}
```
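The new signatures pair an optional callback that receives one `IFetchCommon<T>` with a Promise that resolves to the whole `IFetchCommonArr<T>`. A minimal sketch of that calling pattern follows. `mockFetchData` and `HeadersLike` are hypothetical stand-ins (no network, no `node:http` import), and the assumption that the callback fires once per result is inferred from the types, not confirmed by this diff.

```ts
// Stand-in for IncomingHttpHeaders (node:http) so the sketch needs no imports.
type HeadersLike = Record<string, string | string[] | undefined>

interface IFetchCommon<T> {
  id: number
  statusCode: number | undefined
  headers: HeadersLike
  data: T
}

type IFetchCommonArr<T> = IFetchCommon<T>[]

// Hypothetical stand-in for fetchData: wraps each payload instead of requesting it.
function mockFetchData<T>(
  payloads: T[],
  callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>> {
  const results: IFetchCommonArr<T> = payloads.map((data, index) => ({
    id: index + 1,
    statusCode: 200,
    headers: {},
    data
  }))
  // Assumed behavior: the callback sees each result individually...
  if (callback) results.forEach((res) => callback(res))
  // ...while the Promise resolves with all of them at once.
  return Promise.resolve(results)
}

const seenIds: number[] = []
const pending = mockFetchData(['a', 'b'], (res) => {
  seenIds.push(res.id)
})
pending.then((all) => console.log(seenIds, all.length))
```

Under these assumptions, callers can use the callback, the Promise, or both, which matches the "Support Promise/Callback" feature line added in this commit.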
@@ -142,7 +160,10 @@ fetchHTML is the method of the above [myXCrawl](https://github.com/coder-hxl/x-c
#### Type
```ts
function fetchHTML(config: IFetchHTMLConfig): Promise<IFetchHTML>
fetchHTML(
config: IFetchHTMLConfig,
callback?: (res: IFetchHTML) => void
): Promise<IFetchHTML>
```
#### Example
@@ -161,7 +182,10 @@ fetchData is the method of the above [myXCrawl](#Example-1) instance, which is u
#### Type
```ts
function fetchData<T = any>(config: IFetchDataConfig): Promise<IFetchCommon<T>>
fetchData<T = any>(
config: IFetchDataConfig,
callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>>
```
#### Example
@@ -188,7 +212,10 @@ fetchFile is the method of the above [myXCrawl](#Example-1) instance, which is u
#### Type
```ts
function fetchFile(config: IFetchFileConfig): Promise<IFetchCommon<IFileInfo>>
fetchFile(
config: IFetchFileConfig,
callback?: (res: IFetchCommon<IFileInfo>) => void
): Promise<IFetchCommonArr<IFileInfo>>
```
#### Example
@@ -331,12 +358,18 @@ interface IFetchPollingConfig {
### IFetchCommon
```ts
type IFetchCommon<T> = {
interface IFetchCommon<T> {
id: number
statusCode: number | undefined
headers: IncomingHttpHeaders // node:http type
headers: IncomingHttpHeaders // node:http type
data: T
}[]
}
```
### IFetchCommonArr
```ts
type IFetchCommonArr<T> = IFetchCommon<T>[]
```
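Because `IFetchCommon<T>` now names a single result and `IFetchCommonArr<T>` names the list, helpers can be typed against either one. The sketch below is illustrative only: `HeadersLike` stands in for `IncomingHttpHeaders`, and `isSuccess` and `successfulData` are hypothetical helpers, not part of x-crawl.

```ts
// Stand-in for IncomingHttpHeaders (node:http) so the sketch needs no imports.
type HeadersLike = Record<string, string | string[] | undefined>

interface IFetchCommon<T> {
  id: number
  statusCode: number | undefined
  headers: HeadersLike
  data: T
}

type IFetchCommonArr<T> = IFetchCommon<T>[]

// Hypothetical helper: true when a single result carries a 2xx status code.
function isSuccess<T>(res: IFetchCommon<T>): boolean {
  return res.statusCode !== undefined && res.statusCode >= 200 && res.statusCode < 300
}

// Hypothetical helper: collects the payloads of all successful results.
function successfulData<T>(results: IFetchCommonArr<T>): T[] {
  return results.filter((res) => isSuccess(res)).map((res) => res.data)
}

const sample: IFetchCommonArr<string> = [
  { id: 1, statusCode: 200, headers: {}, data: 'ok' },
  { id: 2, statusCode: undefined, headers: {}, data: 'no response' },
  { id: 3, statusCode: 404, headers: {}, data: 'missing' }
]

console.log(successfulData(sample))
```

Code written against the old array-shaped `IFetchCommon<T>` would need to switch to `IFetchCommonArr<T>` wherever it expects the full list.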
### IFileInfo
53 changes: 43 additions & 10 deletions document/cn.md
@@ -8,7 +8,8 @@ XCrawl is a Nodejs multifunctional crawler library.

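- Crawl HTML, JSON, file resources, etc. with simple configuration
- Use the JSDOM library to parse HTML, or parse HTML by yourself
- Optional mode asynchronous/synchronous for batch requests
- The request method supports asynchronous/synchronous
- Support Promise/Callback
- Polling function
- Anthropomorphic request interval
- Written in TypeScript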
@@ -47,6 +48,7 @@ XCrawl is a Nodejs multifunctional crawler library.
* [IFetchFileConfig](#IFetchFileConfig)
* [IFetchPollingConfig](#IFetchPollingConfig)
* [IFetchCommon](#IFetchCommon)
* [IFetchCommonArr](#IFetchCommonArr)
* [IFileInfo](#IFileInfo)
* [IFetchHTML](#IFetchHTML)
- [More](#更多)
@@ -104,10 +106,26 @@ myXCrawl.fetchPolling({ d: 1 }, () => {
```ts
class XCrawl {
constructor(baseConfig?: IXCrawlBaseConifg)
fetchHTML(config: IFetchHTMLConfig): Promise<IFetchHTML>
fetchData<T = any>(config: IFetchDataConfig): Promise<IFetchCommon<T>>
fetchFile(config: IFetchFileConfig): Promise<IFetchCommon<IFileInfo>>
fetchPolling(config: IFetchPollingConfig, callback: (count: number) => void): void
fetchHTML(
config: IFetchHTMLConfig,
callback?: (res: IFetchHTML) => void
): Promise<IFetchHTML>
fetchData<T = any>(
config: IFetchDataConfig,
callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>>
fetchFile(
config: IFetchFileConfig,
callback?: (res: IFetchCommon<IFileInfo>) => void
): Promise<IFetchCommonArr<IFileInfo>>
fetchPolling(
config: IFetchPollingConfig,
callback: (count: number) => void
): void
}
```
@@ -154,7 +172,10 @@ fetchHTML is the method of the [myXCrawl](https://github.com/coder-hxl/x-crawl/blob/main/document
#### Type
```ts
function fetchHTML(config: IFetchHTMLConfig): Promise<IFetchHTML>
fetchHTML(
config: IFetchHTMLConfig,
callback?: (res: IFetchHTML) => void
): Promise<IFetchHTML>
```
#### Example
@@ -173,7 +194,10 @@ fetch is the method of the [myXCrawl](#示例-1) instance, usually used to crawl APIs, can
#### Type
```ts
function fetchData<T = any>(config: IFetchDataConfig): Promise<IFetchCommon<T>>
fetchData<T = any>(
config: IFetchDataConfig,
callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>>
```
#### Example
@@ -200,7 +224,10 @@ fetchFile is the method of the [myXCrawl](#示例-1) instance, usually used to crawl files
#### Type
```ts
function fetchFile(config: IFetchFileConfig): Promise<IFetchCommon<IFileInfo>>
fetchFile(
config: IFetchFileConfig,
callback?: (res: IFetchCommon<IFileInfo>) => void
): Promise<IFetchCommonArr<IFileInfo>>
```
#### Example
@@ -343,12 +370,18 @@ interface IFetchPollingConfig {
### IFetchCommon
```ts
type IFetchCommon<T> = {
interface IFetchCommon<T> {
id: number
statusCode: number | undefined
headers: IncomingHttpHeaders // node:http type
data: T
}[]
}
```
### IFetchCommonArr
```ts
type IFetchCommonArr<T> = IFetchCommon<T>[]
```
### IFileInfo
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"private": true,
"name": "x-crawl",
"version": "0.4.0",
"version": "1.0.0",
"author": "CoderHxl",
"description": "XCrawl is a Nodejs multifunctional crawler library.",
"license": "MIT",
55 changes: 44 additions & 11 deletions publish/README.md
@@ -8,7 +8,8 @@ XCrawl is a Nodejs multifunctional crawler library.

- Crawl HTML, JSON, file resources, etc. with simple configuration
- Use the JSDOM library to parse HTML, or parse HTML by yourself
- Optional mode asynchronous/synchronous for batch requests
- The request method supports asynchronous/synchronous
- Support Promise/Callback
- Polling function
- Anthropomorphic request interval
- Written in TypeScript
@@ -47,6 +48,7 @@ XCrawl is a Nodejs multifunctional crawler library.
* [IFetchFileConfig](#IFetchFileConfig)
* [IFetchPollingConfig](#IFetchPollingConfig)
* [IFetchCommon](#IFetchCommon)
* [IFetchCommonArr](#IFetchCommonArr)
* [IFileInfo](#IFileInfo)
* [IFetchHTML](#IFetchHTML)
- [More](#More)
@@ -92,10 +94,26 @@ Create a crawler instance via new XCrawl. The request queue is maintained by the
```ts
class XCrawl {
constructor(baseConfig?: IXCrawlBaseConifg)
fetchHTML(config: IFetchHTMLConfig): Promise<IFetchHTML>
fetchData<T = any>(config: IFetchDataConfig): Promise<IFetchCommon<T>>
fetchFile(config: IFetchFileConfig): Promise<IFetchCommon<IFileInfo>>
fetchPolling(config: IFetchPollingConfig, callback: (count: number) => void): void

fetchHTML(
config: IFetchHTMLConfig,
callback?: (res: IFetchHTML) => void
): Promise<IFetchHTML>

fetchData<T = any>(
config: IFetchDataConfig,
callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>>

fetchFile(
config: IFetchFileConfig,
callback?: (res: IFetchCommon<IFileInfo>) => void
): Promise<IFetchCommonArr<IFileInfo>>

fetchPolling(
config: IFetchPollingConfig,
callback: (count: number) => void
): void
}
```
@@ -142,7 +160,10 @@ fetchHTML is the method of the above [myXCrawl](https://github.com/coder-hxl/x-c
#### Type
```ts
function fetchHTML(config: IFetchHTMLConfig): Promise<IFetchHTML>
fetchHTML(
config: IFetchHTMLConfig,
callback?: (res: IFetchHTML) => void
): Promise<IFetchHTML>
```
#### Example
@@ -161,7 +182,10 @@ fetchData is the method of the above [myXCrawl](#Example-1) instance, which is u
#### Type
```ts
function fetchData<T = any>(config: IFetchDataConfig): Promise<IFetchCommon<T>>
fetchData<T = any>(
config: IFetchDataConfig,
callback?: (res: IFetchCommon<T>) => void
): Promise<IFetchCommonArr<T>>
```
#### Example
@@ -188,7 +212,10 @@ fetchFile is the method of the above [myXCrawl](#Example-1) instance, which is u
#### Type
```ts
function fetchFile(config: IFetchFileConfig): Promise<IFetchCommon<IFileInfo>>
fetchFile(
config: IFetchFileConfig,
callback?: (res: IFetchCommon<IFileInfo>) => void
): Promise<IFetchCommonArr<IFileInfo>>
```
#### Example
@@ -331,12 +358,18 @@ interface IFetchPollingConfig {
### IFetchCommon
```ts
type IFetchCommon<T> = {
interface IFetchCommon<T> {
id: number
statusCode: number | undefined
headers: IncomingHttpHeaders // node:http type
headers: IncomingHttpHeaders // node:http type
data: T
}[]
}
```
### IFetchCommonArr
```ts
type IFetchCommonArr<T> = IFetchCommon<T>[]
```
### IFileInfo
2 changes: 1 addition & 1 deletion publish/package.json
@@ -1,6 +1,6 @@
{
"name": "x-crawl",
"version": "0.4.0",
"version": "1.0.0",
"author": "CoderHxl",
"description": "XCrawl is a Nodejs multifunctional crawler library.",
"license": "MIT",
2 changes: 1 addition & 1 deletion test/start/index.js

Some generated files are not rendered by default.
