- Add `useIncognitoPages` option to `PuppeteerPool` to enable opening new pages in incognito browser contexts. This is useful to keep cookies and cache unique for each page.
- This release updates `@apify/http-request` to version 1.1.2.
- Update `CheerioCrawler` to use `requestAsBrowser()` to better disguise itself as a real browser.
- This release just updates some dependencies (not Puppeteer).
- DEPRECATED: The `dataset.delete()`, `keyValueStore.delete()` and `requestQueue.delete()` methods have been deprecated in favor of the `*.drop()` methods, because the `drop` name more clearly communicates the fact that those methods drop / delete the storage itself, not individual elements in the storage.
- Added `Apify.utils.requestAsBrowser()` helper function that enables you to make HTTP(S) requests disguised as a browser (Firefox). This may help in overcoming certain anti-scraping and anti-bot protections.
- Added `options.gotoTimeoutSecs` to `PuppeteerCrawler` to enable easier setting of navigation timeouts.
- `PuppeteerPool` options that were deprecated from the `PuppeteerCrawler` constructor were finally removed. Please use `maxOpenPagesPerInstance`, `retireInstanceAfterRequestCount`, `instanceKillerIntervalSecs`, `killInstanceAfterSecs` and `proxyUrls` via the `puppeteerPoolOptions` object.
- On the Apify Platform, a warning will now be printed when using an outdated `apify` package version.
- `Apify.utils.puppeteer.enqueueLinksByClickingElements()` will now print a warning when the nodes it tries to click become modified (detached from the DOM). This is useful for debugging unexpected behavior.
- `Apify.launchPuppeteer()` now accepts a `proxyUrl` with the `https`, `socks4` and `socks5` schemes, as long as it doesn't contain a username or password. This fixes Issue #420.
- Added `desiredConcurrency` option to the `AutoscaledPool` constructor, removed an unnecessary bound check from the setter property.
- Fix error where Puppeteer would fail to launch when pipes are turned off.
- Switch back to default Web Socket transport for Puppeteer due to upstream issues.
- BREAKING CHANGE: Removed support for Web Driver (Selenium), since no further updates are planned. If you wish to continue using Web Driver, please stay on Apify SDK version ^0.14.15.
- BREAKING CHANGE: `Dataset.getData()` throws an error if the user provides an unsupported option when using local disk storage.
- DEPRECATED: `options.userData` of `Apify.utils.enqueueLinks()` is deprecated. Use `options.transformRequestFunction` instead.
- Improve logging of memory overload errors.
- Improve the error message in `Apify.call()`.
- Fix multiple log lines appearing when a crawler was about to finish.
- Add `Apify.utils.puppeteer.enqueueLinksByClickingElements()` function which enables you to add requests to the queue from pure JavaScript navigations, form submissions etc.
- Add `Apify.utils.puppeteer.infiniteScroll()` function which helps you with scrolling to the bottom of websites that auto-load new content.
- The `RequestQueue.handledCount()` function has been resurrected from deprecation, in order to have an interface compatible with `RequestList`.
- Add `useExtendedUniqueKey` option to the `Request` constructor to include `method` and `payload` in the `Request`'s computed `uniqueKey`.
- Updated Puppeteer to 1.18.1.
- Updated `apify-client` to 0.5.22.
- Fixes in `RequestQueue` to deal with inconsistencies in the underlying data storage.
- BREAKING CHANGE: `RequestQueue.addRequest()` now sets the ID of the newly added request on the passed `Request` object.
- The `RequestQueue.handledCount()` function has been deprecated. Please use `RequestQueue.getInfo()` instead.
- Fix error where live view would crash when started with concurrency already higher than 1.
- Fix `POST` requests in Puppeteer.
- `Snapshotter` will now log critical memory overload warnings at most once per 10 seconds.
- Live view snapshots are now made right after navigation finishes, instead of right before page close.
- Add `Statistics` class to track crawler run statistics.
- Use pipes instead of web sockets in Puppeteer to improve performance and stability.
- Add warnings to all functions using Puppeteer's request interception to inform users about its performance impact caused by automatic cache disabling.
- DEPRECATED: `Apify.utils.puppeteer.blockResources()` because of its negative impact on performance. Use `.blockRequests()` (see below).
- Add `Apify.utils.puppeteer.blockRequests()` to enable blocking URL patterns without request interception involved. This is a replacement for `.blockResources()` until the performance issues with request interception are resolved.
- Update `Puppeteer` to 1.17.0.
- Add `idempotencyKey` parameter to `Apify.addWebhook()`.
- Better logs from the `AutoscaledPool` class.
- Replace the `cpuInfo` Apify event with the new `systemInfo` event in `Snapshotter`.
- Bump `apify-client` to 0.5.17.
- Bump `apify-client` to 0.5.16.
- Stringification to JSON of actor input in `Apify.call()`, `Apify.callTask()` and `Apify.metamorph()` now also supports functions via `func.toString()`. The same holds for the record body in the `setValue()` method of the key-value store.
- The request queue now monitors the number of clients that have accessed the queue, which allows crawlers to finish without the 10s wait if the run was not migrated during its lifetime.
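The function-stringification behavior described above can be sketched in plain JavaScript. The replacer below illustrates the concept only; it is not the SDK's internal code:

```javascript
// Illustration: serialize an input object whose values may include
// functions, by converting them to source strings via func.toString().
function stringifyWithFunctions(input) {
    return JSON.stringify(input, (key, value) => (
        typeof value === 'function' ? value.toString() : value
    ), 2);
}

const json = stringifyWithFunctions({
    startUrl: 'https://example.com',
    pageFunction: (context) => context.request.url,
});
console.log(json.includes('context.request.url')); // true — function kept as source
```

The resulting JSON is valid because the function has become an ordinary string field.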
- Update Puppeteer to 1.15.0.
- Added the `stealth` option to `launchPuppeteerOptions`, which decreases the chance of headless browser detection.
- DEPRECATED: `Apify.utils.puppeteer.hideWebDriver`, use `launchPuppeteerOptions.stealth` instead.
- `CheerioCrawler` now parses HTML using streams. This improves performance and memory usage in most cases.
- Request queue now allows crawlers to finish quickly without waiting, in cases where the queue was used by a single client.
- Better logging of errors in `Apify.main()`.
- Fix invalid type check in `puppeteerModule`.
- Made UI and UX improvements to the `LiveViewServer` functionality.
- `launchPuppeteerOptions.puppeteerModule` now supports `Object` (pre-required modules).
- Removed the `--enable-resource-load-scheduler=false` Chromium command line flag, as it has no effect. See https://bugs.chromium.org/p/chromium/issues/detail?id=723233
- Fixed an inconsistency in `prepareRequestFunction` of `CheerioCrawler`.
- Update Puppeteer to 1.14.0.
- BREAKING CHANGE: Live View is no longer available by passing `liveView = true` to `launchPuppeteerOptions`.
- A new version of Live View is available by passing the `useLiveView = true` option to `PuppeteerPool`.
  - It only shows snapshots of a single page from a single browser.
  - It only makes snapshots when a client is connected, having very low performance impact otherwise.
- Added `Apify.utils.puppeteer.addInterceptRequestHandler` and `removeInterceptRequestHandler`, which can be used to add multiple request interception handlers to Puppeteer's pages.
- Added `puppeteerModule` to `LaunchPuppeteerOptions`, which enables use of other Puppeteer modules, such as `puppeteer-extra`, instead of plain `puppeteer`.
- Fix a bug where an invalid response from `RequestQueue` would occasionally cause crawlers to crash.
- Fix `RequestQueue` throttling at high concurrency.
- Fix bug in `addWebhook` invocation.
- Fix `puppeteerPoolOptions` object not being used in `PuppeteerCrawler`.
- Fix `REQUEST_QUEUE_HEAD_MAX_LIMIT` is not defined error.
- `Snapshotter` now marks the Apify Client as overloaded on the basis of second-retry errors.
- Added `Apify.addWebhook()` to invoke a webhook when an actor run ends. Currently this only works on the Apify Platform and will print a warning when run locally.
- BREAKING CHANGE: Added `puppeteerOperationTimeoutSecs` option to `PuppeteerPool`. It defaults to 15 seconds and all Puppeteer operations such as `browser.newPage()` or `puppeteer.launch()` will now time out. This is to prevent hanging requests.
- BREAKING CHANGE: Added `handleRequestTimeoutSecs` option to `BasicCrawler` with a 60 second default.
- DEPRECATED: `PuppeteerPool` options in the `PuppeteerCrawler` constructor are now deprecated. Please use the new `puppeteerPoolOptions` argument of type `Object` to pass them. `launchPuppeteerFunction` and `launchPuppeteerOptions` are still available as shortcuts for convenience.
- `CheerioCrawler` and `PuppeteerCrawler` now automatically set `handleRequestTimeoutSecs` to 10 times their `handlePageTimeoutSecs`. This is a precaution that should keep requests from hanging forever.
- Added `options.prepareRequestFunction()` to the `CheerioCrawler` constructor to enable modification of the `Request` before the HTTP request is made to the target URL.
- Added back the `recycleDiskCache` option to `PuppeteerPool`, now that it is supported even in headless mode (read more).
- Parameters `input` and `options` added to `Apify.callTask()`.
- Added oldest active tab focusing to `PuppeteerPool` to combat resource throttling in Chromium.
- Added `Apify.metamorph()`, see documentation for more information.
- Added `Apify.getInput()`.
- BREAKING CHANGE: Reduced the default `handlePageTimeoutSecs` for both `CheerioCrawler` and `PuppeteerCrawler` from 300 to 60 seconds, in order to prevent stalling crawlers.
- BREAKING CHANGE: `PseudoUrl` now performs case-insensitive matching, even for the query string part of URLs. If you need case-sensitive matching, use an appropriate `RegExp` in place of a Pseudo URL string.
- Upgraded to puppeteer@1.12.2 and xregexp@4.2.4.
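The difference between the new case-insensitive matching and the suggested `RegExp` workaround can be shown with two plain regular expressions (the URLs here are made up for illustration):

```javascript
const url = 'https://example.com/Search?Category=Books';

// With the /i flag, matching ignores case everywhere, including the
// query string — analogous to the new PseudoUrl behavior.
const caseInsensitive = /^https:\/\/example\.com\/search\?category=books$/i;

// Without /i, every character must match exactly — use a RegExp like
// this instead of a Pseudo URL string when case matters.
const caseSensitive = /^https:\/\/example\.com\/search\?category=books$/;

console.log(caseInsensitive.test(url)); // true  — case is ignored
console.log(caseSensitive.test(url));   // false — exact case required
```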
- Added `loadedUrl` property to `Request` that contains the final URL of the loaded page after all redirects.
- Added a memory overload warning log message.
- Added `keyValueStore.getPublicUrl` function.
- Added `minConcurrency`, `maxConcurrency`, `desiredConcurrency` and `currentConcurrency` properties to `AutoscaledPool`, improved docs.
- Deprecated the `AutoscaledPool.setMinConcurrency` and `AutoscaledPool.setMaxConcurrency` functions.
- Updated `DEFAULT_USER_AGENT` and `USER_AGENT_LIST` with new User Agents.
- Bugfix: `LocalRequestQueue.getRequest()` threw an exception if the request was not found.
- Added `RequestQueue.getInfo()` function.
- Improved `Apify.main()` to provide nicer stack traces on errors.
- `Apify.utils.puppeteer.injectFile()` now supports injection that survives page navigations and caches file contents.
- Fix the `keyValueStore.forEachKey()` method.
- Fix the version of `puppeteer` to prevent errors with automatic updates.
- Apify SDK now logs basic system info when `require`d.
- Added `utils.createRequestDebugInfo()` function to create standardized debug info from request and response.
- `PseudoUrl` can now be constructed with a `RegExp`.
- `Apify.utils.enqueueLinks()` now accepts `RegExp` instances in its `pseudoUrls` parameter.
- `Apify.utils.enqueueLinks()` now accepts a `baseUrl` option that enables resolution of relative URLs when parsing a Cheerio object. (It's done automatically in the browser when using Puppeteer.)
- Better error message for an invalid `launchPuppeteerFunction` passed to `PuppeteerPool`.
- DEPRECATION WARNING: `Apify.utils.puppeteer.enqueueLinks()` was moved to `Apify.utils.enqueueLinks()`.
- `Apify.utils.enqueueLinks()` now supports an `options.$` property to enqueue links from a Cheerio object.
- Disabled the `PuppeteerPool` `reusePages` option for now, due to a memory leak.
- Added a `keyValueStore.forEachKey()` method to iterate all keys in the store.
- Improvements in `Apify.utils.social.parseHandlesFromHtml` and `Apify.utils.htmlToText`.
- Updated docs
- Fix `reusePages` causing Puppeteer to fail when used together with request interception.
- Fix missing `reusePages` configuration parameter in `PuppeteerCrawler`.
- Fix a memory leak where `reusePages` would prevent browsers from closing.
- Fix missing `autoscaledPool` parameter in the `handlePageFunction` of `PuppeteerCrawler`.
- BREAKING CHANGE: The `basicCrawler.abort()`, `cheerioCrawler.abort()` and `puppeteerCrawler.abort()` functions were removed in favor of a single `autoscaledPool.abort()` function.
- Added a reference to the running `AutoscaledPool` instance to the options object of `BasicCrawler`'s `handleRequestFunction` and to the `handlePageFunction` of `CheerioCrawler` and `PuppeteerCrawler`.
- Added a sources persistence option to `RequestList` that works best in conjunction with the state persistence, but can be toggled separately too.
- Added `Apify.openRequestList()` function to place it in line with `RequestQueue`, `KeyValueStore` and `Dataset`. A `RequestList` created using this function will automatically persist state and sources.
- Added `pool.pause()` and `pool.resume()` functions to `AutoscaledPool`. You can now pause the pool, which will prevent additional tasks from being run and wait for the running ones to finish.
- Fixed a memory leak in `CheerioCrawler` and potentially other crawlers.
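The pause/resume semantics can be sketched with a tiny stand-in pool: while paused, no new tasks are picked up, and already-running tasks are allowed to finish. The `TinyPool` class below is a hypothetical illustration, not `AutoscaledPool`'s implementation:

```javascript
// Minimal sketch of pause/resume: a paused pool stops dequeuing tasks
// but does not cancel the one currently in flight.
class TinyPool {
    constructor(tasks) {
        this.tasks = tasks;      // array of async functions
        this.paused = false;
        this.completed = [];
    }

    pause() { this.paused = true; }
    resume() { this.paused = false; }

    async run() {
        while (this.tasks.length > 0) {
            if (this.paused) {
                // While paused, don't start new tasks; just wait and re-check.
                await new Promise((resolve) => setTimeout(resolve, 10));
                continue;
            }
            const task = this.tasks.shift();
            this.completed.push(await task());
        }
        return this.completed;
    }
}

const pool = new TinyPool([async () => 1, async () => 2]);
pool.run().then((results) => console.log(results)); // [ 1, 2 ]
```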
- Added `Apify.utils.htmlToText()` function to convert HTML to text and removed the unnecessary `html-to-text` dependency. The new function is now used in `Apify.utils.social.parseHandlesFromHtml()`.
- Updated `DEFAULT_USER_AGENT`.
- `autoscaledPool.isFinishedFunction()` and `autoscaledPool.isTaskReadyFunction()` exceptions will now cause the `Promise` returned by `autoscaledPool.run()` to reject, instead of just logging a message. This is in line with the `autoscaledPool.runTaskFunction()` behavior.
- Bugfix: `PuppeteerPool` was incorrectly overriding `proxyUrls` even if they were not defined.
- Fixed an issue where an error would be thrown when `datasetLocal.getData()` was invoked with an overflowing offset. It now correctly returns an empty `Array`.
- Added the `reusePages` option to `PuppeteerPool`. When enabled, it will reuse existing tabs instead of opening new ones for each page.
- `BasicCrawler` (and therefore all Crawlers) now logs a message explaining why it finished.
- Fixed an issue where the `maxRequestsPerCrawl` option would not be honored after restart or migration.
- Fixed an issue with timeout promises that would sometimes keep the process hanging.
- `CheerioCrawler` now accepts `gzip` and `deflate` compressed responses.
- Upgraded Puppeteer to 1.11.0
- DEPRECATION WARNING: `Apify.utils.puppeteer.enqueueLinks()` now uses an options object instead of individual parameters and supports passing of `userData` to the enqueued `request`. Previously: `enqueueLinks(page, selector, requestQueue, pseudoUrls)`. Now: `enqueueLinks({ page, selector, requestQueue, pseudoUrls, userData })`. Using individual parameters is DEPRECATED.
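A common way to handle such a migration is a shim that accepts both call styles and normalizes them. The sketch below is a generic illustration of that pattern — `enqueueLinksCompat` and its detection logic are invented here, not the SDK's actual code:

```javascript
// Hypothetical shim: accept either the deprecated positional arguments
// or the new single options object, and normalize to the options form.
function enqueueLinksCompat(pageOrOptions, selector, requestQueue, pseudoUrls) {
    let options;
    if (selector === undefined
            && typeof pageOrOptions === 'object'
            && pageOrOptions.page !== undefined) {
        // New style: a single options object was passed.
        options = pageOrOptions;
    } else {
        // Old style: individual parameters. Warn and repackage.
        console.warn('DEPRECATED: pass a single options object instead.');
        options = { page: pageOrOptions, selector, requestQueue, pseudoUrls };
    }
    return options;
}

// Both call styles normalize to the same options object:
const a = enqueueLinksCompat({ page: 'p', selector: 'a', requestQueue: 'q' });
const b = enqueueLinksCompat('p', 'a', 'q');
console.log(a.selector === b.selector); // true
```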
- Added API response tracking to `AutoscaledPool`, leveraging the `Apify.client.stats` object. It now considers the system overloaded when a large number of 429 (Too Many Requests) responses is received.
- Updated NPM packages to fix a vulnerability reported at dominictarr/event-stream#116
- Added a warning, instead of failing to start, when the Node.js version is an older one that doesn't support the regular expression syntax used by the tools in the `Apify.utils.social` namespace.
- Added back support for the `memory` option in `Apify.call()`; a deprecation warning is now written instead of silently failing.
- Improvements in `Apify.utils.social` functions and tests.
- Added new `Apify.utils.social` namespace with functions to extract emails, phones and social profile URLs from HTML and text documents. Specifically, it supports Twitter, LinkedIn, Instagram and Facebook profiles.
- Updated NPM dependencies.
- `Apify.launchPuppeteer()` now sets the `defaultViewport` option if not provided by the user, to improve screenshots and the debugging experience.
- Bugfix: `Dataset.getInfo()` sometimes returned an object with an `itemsCount` field instead of `itemCount`.
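The kind of extraction the `Apify.utils.social` helpers perform can be illustrated with a deliberately simplified email matcher. This is only a sketch of the idea; the real helpers handle far more edge cases:

```javascript
// Illustration only: naive email extraction from plain text.
const EMAIL_REGEX = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;

function emailsFromText(text) {
    return text.match(EMAIL_REGEX) || [];
}

const text = 'Contact us at info@example.com or sales@example.org.';
console.log(emailsFromText(text)); // [ 'info@example.com', 'sales@example.org' ]
```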
- Improvements in deployment script.
- Bugfix: `Apify.call()` was causing a permissions error.
- Automatically adding the `--enable-resource-load-scheduler=false` Chrome flag in `Apify.launchPuppeteer()` to make crawling of pages in all tabs run equally fast.
- Bug fixes and improvements of internals.
- Package updates.
- Added the ability of `CheerioCrawler` to request and download only `text/html` responses.
- Added a workaround for a long-standing `tunnel-agent` package error to `CheerioCrawler`.
- Added `request.doNotRetry()` function to prevent further retries of a `request`.
- Deprecated the `request.ignoreErrors` option. Use `request.doNotRetry` instead.
- Fixed `Apify.utils.puppeteer.enqueueLinks` to allow a `null` value for the `pseudoUrls` param.
- Fixed `RequestQueue.addRequest()` to gracefully handle invalid URLs.
- Renamed `RequestOperationInfo` to `QueueOperationInfo`.
- Added `request` field to `QueueOperationInfo`.
- DEPRECATION WARNING: The `timeoutSecs` parameter of `Apify.call()` is used for the actor run timeout. To set the time to wait for the run to finish, use the `waitSecs` parameter.
- DEPRECATION WARNING: The `memory` parameter of `Apify.call()` was renamed to `memoryMbytes`.
- Added `Apify.callTask()` that enables starting an actor task and fetching its output.
- Added an option enforcing cloud storage to be used in `openKeyValueStore()`, `openDataset()` and `openRequestQueue()`.
- Added `autoscaledPool.setMinConcurrency()` and `autoscaledPool.setMaxConcurrency()`.
- Fixed a bug in `CheerioCrawler` where `useApifyProxy` would only work with `apifyProxyGroups`.
- Reworked `request.pushErrorMessage()` to support any message and not throw.
- Added Apify Proxy (`useApifyProxy`) support to `CheerioCrawler`.
- Added custom `proxyUrls` support to `PuppeteerPool` and `CheerioCrawler`.
- Added Actor UI `pseudoUrls` output support to `Apify.utils.puppeteer.enqueueLinks()`.
- Created dedicated project page at https://sdk.apify.com
- Improved docs, texts, guides and other texts, pointed links to new page
- Bugfix in `PuppeteerPool`: pages were sometimes considered closed even though they weren't.
- Improvements in documentation.
- Upgraded Puppeteer to 1.9.0
- Added `Apify.utils.puppeteer.cacheResponses` to enable response caching in headless Chromium.
- Fixed `AutoscaledPool` terminating before all tasks are finished.
- Migrated to v0.1.0 of `apify-shared`.
- Allow `AutoscaledPool` to run tasks up to `minConcurrency` even when the system is overloaded.
- Upgraded the `@apify/ps-tree` dependency (fixes "Error: spawn ps ENFILE"), upgraded other NPM packages.
- Updated documentation and README, consolidated images.
- Added CONTRIBUTING.md
- Updated documentation and README.
- Bugfixes in `RequestQueueLocal`.
- Updated documentation and README.
- Optimized autoscaled pool default configuration.
- BREAKING CHANGES IN AUTOSCALED POOL
- It has been completely rebuilt for better performance.
- It also now works locally.
- see Migration Guide for more information.
- Updated to apify-shared@0.0.58
- Bug fixes and documentation improvements.
- Upgraded Puppeteer to 1.8.0
- Upgraded NPM dependencies, fixed lint errors
- `Apify.main()` now sets the `APIFY_LOCAL_STORAGE_DIR` env var to a default value if neither `APIFY_LOCAL_STORAGE_DIR` nor `APIFY_TOKEN` is defined.
- Updated `DEFAULT_USER_AGENT` and `USER_AGENT_LIST`.
- Added `recycleDiskCache` option to `PuppeteerPool` to enable reuse of disk cache and thus speed up browsing.
- WARNING: The `APIFY_LOCAL_EMULATION_DIR` environment variable was renamed to `APIFY_LOCAL_STORAGE_DIR`.
- Environment variables `APIFY_DEFAULT_KEY_VALUE_STORE_ID`, `APIFY_DEFAULT_REQUEST_QUEUE_ID` and `APIFY_DEFAULT_DATASET_ID` now have the default value `default`, so there is no need to define them when developing locally.
- Added `compileScript()` function to `utils.puppeteer` to enable use of external scripts at runtime.
- Fixed a persistent deprecation warning of `pageOpsTimeoutMillis`.
- Moved `cheerio` to dependencies.
- Fixed `keepDuplicateUrls` errors with a persistent `RequestList`.
- Added `getInfo()` method to `Dataset` to get meta-information about a dataset.
- Added `CheerioCrawler`, a specialized class for crawling the web using `cheerio`.
- Added `keepDuplicateUrls` option to `RequestList` to allow duplicate URLs.
- Added `.abort()` method to all Crawler classes to enable stopping the crawl programmatically.
- Deprecated the `pageOpsTimeoutMillis` option. Use `handlePageTimeoutSecs` instead.
- Bluebird promises are being phased out of `apify` in favor of `async-await`.
- Added `log` to `Apify.utils` to improve the logging experience.
- Replaced git-hosted version of our fork of ps-tree with @apify/ps-tree package
- Removed the old unused `Apify.readyFreddy()` function.
- Improved logging of URL and port in `PuppeteerLiveViewBrowser`.
- `PuppeteerCrawler`'s default page load timeout changed from 30 to 60 seconds.
- Added `Apify.utils.puppeteer.blockResources()` function.
- More efficient implementation of the `getMemoryInfo` function.
- Puppeteer upgraded to 1.7.0.
- Upgraded NPM dependencies
- Dropped support for Node 7
- Fixed unresponsive magnifying glass and improved status tracking in LiveView frontend
- Fixed invalid URL parsing in RequestList.
- Added support for non-Latin language characters (unicode) in URLs.
- Added validation of payload size and automatic chunking to `dataset.pushData()`.
- Added support for all content types and their known extensions to `KeyValueStoreLocal`.
- Puppeteer upgraded to 1.6.0.
- Removed the `pageCloseTimeoutMillis` option from `PuppeteerCrawler`, since it only affects debug logging.
- Fixed a bug where a failed `page.close()` in `PuppeteerPool` was causing the request to be retried.
- Added `memory` parameter to `Apify.call()`.
- Added `PuppeteerPool.retire(browser)` method allowing a browser to be retired before it reaches its limits. This is useful when its IP address gets blocked by anti-scraping protection.
- Added option `liveView: true` to `Apify.launchPuppeteer()` that will start a live view server providing a web page with an overview of all running Puppeteer instances and their screenshots.
- `PuppeteerPool` now kills open Chrome instances on the `SIGINT` signal.
- Bugfix in `BasicCrawler`: native `Promise` doesn't have a `finally()` function.
- Parameter `maxRequestsPerCrawl` added to the `BasicCrawler` and `PuppeteerCrawler` classes.
- Reverted: `Apify.getApifyProxyUrl()` again accepts the `session` and `groups` options instead of `apifyProxySession` and `apifyProxyGroups`.
- Parameter `memory` added to `Apify.call()`.
- The `PseudoUrl` class can now contain a template for `Request` object creation, and a `PseudoUrl.createRequest()` method was added.
- Added `Apify.utils.puppeteer.enqueueLinks()` function, which enqueues requests created from links matching given pseudo-URLs.
- Added a 30s timeout to the `page.close()` operation in `PuppeteerCrawler`.
- Added `dataset.getData()`, `dataset.map()`, `dataset.forEach()` and `dataset.reduce()` functions.
- Added `delete()` method to the `RequestQueue`, `Dataset` and `KeyValueStore` classes.
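The shape of the new dataset iteration helpers can be sketched over a plain array of items standing in for a stored dataset (the real methods are asynchronous and page through storage):

```javascript
// Stand-in for items stored in a Dataset.
const items = [{ price: 10 }, { price: 20 }, { price: 5 }];

// dataset.map(fn) — transform every item:
const prices = items.map((item) => item.price);

// dataset.reduce(fn, memo) — aggregate across items:
const total = items.reduce((memo, item) => memo + item.price, 0);

console.log(prices); // [ 10, 20, 5 ]
console.log(total);  // 35
```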
- Added `loggingIntervalMillis` option to `AutoscaledPool`.
- Bugfix: the `utils.isProduction` function was incorrect.
- Added `RequestList.length()` function.
- Bugfix in `RequestList`: skip invalid in-progress entries when restoring state.
- Added `request.ignoreErrors` option. See documentation for more info.
- Bugfix in the `Apify.utils.puppeteer.injectXxx` functions.
- Puppeteer updated to v1.4.0
- Added `Apify.utils` and `Apify.utils.puppeteer` namespaces for various helper functions.
- The autoscaling feature of `AutoscaledPool`, `BasicCrawler` and `PuppeteerCrawler` is disabled on the Apify platform until all issues are resolved.
- Added `Apify.isAtHome()` function that returns `true` when the code is running on the Apify platform and `false` otherwise (for example, locally).
- Added `ignoreMainProcess` parameter to `AutoscaledPool`. Check documentation for more info.
- `pageOpsTimeoutMillis` of `PuppeteerCrawler` increased to 300 seconds.
- Parameters `session` and `groups` of `getApifyProxyUrl()` renamed to `apifyProxySession` and `apifyProxyGroups` to match the naming of the same parameters in other classes.
- `RequestQueue` now caches known requests and their state to avoid unneeded API calls.
- WARNING: The `disableProxy` configuration of `PuppeteerCrawler` and `PuppeteerPool` was removed. By default, no proxy is used. You must either use the new configuration `launchPuppeteerOptions.useApifyProxy = true` to use Apify Proxy, or provide your own proxy via `launchPuppeteerOptions.proxyUrl`.
- WARNING: The `groups` parameter of `PuppeteerCrawler` and `PuppeteerPool` was removed. Use `launchPuppeteerOptions.apifyProxyGroups` instead.
- WARNING: The `session` and `groups` parameters of `Apify.getApifyProxyUrl()` are now validated to contain only alphanumeric characters and underscores.
- `Apify.call()` now throws an `ApifyCallError` error if the run doesn't succeed.
- Renamed the `abortInstanceAfterRequestCount` option of `PuppeteerPool` and `PuppeteerCrawler` to `retireInstanceAfterRequestCount`.
- Logs are now in plain text instead of JSON for better readability.
- WARNING: `AutoscaledPool` was completely redesigned. Check the documentation for reference. It still supports the previous configuration parameters for backwards compatibility, but in the future compatibility will break.
- `handleFailedRequestFunction` in both `BasicCrawler` and `PuppeteerCrawler` now also has the error object available in `ops.error`.
- Request Queue storage type implemented. See documentation for more information.
- `BasicCrawler` and `PuppeteerCrawler` now support both `RequestList` and `RequestQueue`.
- `launchPuppeteer()` changes the `User-Agent` only when in headless mode or if not using full Google Chrome, to reduce the chance of detection of the crawler.
- The Apify package now supports Node 7 and newer.
- `AutoscaledPool` now scales down less aggressively.
- `PuppeteerCrawler` and `BasicCrawler` now allow their underlying `AutoscaledPool` function `isFunction` to be overridden.
- New events `persistState` and `migrating` added. Check the documentation of `Apify.events` for more information.
- `RequestList` has a new parameter `persistStateKey`. If this is used, `RequestList` persists its state in the default key-value store at regular intervals.
- Improved `README.md` and the `/examples` directory.
- Added `useChrome` flag to `launchPuppeteer()` function.
- Bugfixes in `RequestList`.
- Removed the `--disable-dev-shm-usage` flag again when launching headless Chrome, as it might be causing issues with high IO overhead.
- Upgraded Puppeteer to version 1.2.0
- Added `finishWhenEmpty` and `maybeRunPromiseIntervalMillis` options to the `AutoscaledPool` class.
- Fixed false positive errors logged by the `PuppeteerPool` class.
- Added back `--no-sandbox` to the launch of Puppeteer to avoid issues on older kernels.
- If the `APIFY_XVFB` env var is set to `1`, headless mode is avoided and Xvfb is used instead.
- Updated `DEFAULT_USER_AGENT` to Linux Chrome.
- Consolidated startup options for Chrome: use `--disable-dev-shm-usage`, skip `--no-sandbox`, use `--disable-gpu` only on Windows.
only on Windows - Updated docs and package description
- Puppeteer updated to `1.1.1`.
- A lot of new stuff. Everything is backwards compatible. Check https://apify.com/docs/sdk/apify-runtime-js/latest for reference
- `Apify.setPromiseDependency()` / `Apify.getPromiseDependency()` / `Apify.getPromisePrototype()` removed.
- A bunch of classes such as `AutoscaledPool` or `PuppeteerCrawler` added, check documentation.
- Renamed GitHub repo
- Changed links to Travis CI
- Changed links to Apify GitHub repo
- `Apify.pushData()` added.
- Upgraded the `puppeteer` optional dependency to version `^1.0.0`.
- Initial development, a lot of new stuff.