No longer maintained
I'd really recommend using Puppeteer, which solves the problem I was trying to solve here and is also really nice and quick.
A high-level, promise-based wrapper for PhantomJS
The aim is to make interacting with PhantomJS from node as simple as possible. All actions are asynchronous and return a bluebird promise. The promises have been extended with Phantasma methods, allowing for a fluent API.
This project is heavily influenced by Nightmare, but differs in approach. Nightmare queues up actions, which are then executed when .run() is called; once this is done, phantomjs is exited. This is fine if you already know the actions you want to take; however, it's not possible to change the flow of actions midway, e.g. if a popup or button sometimes appears on a page and you want to click it before continuing with the next action. Phantasma takes a different approach: it uses promises, which leaves queueing up to the promise library (bluebird) and leaves you in control of when to exit the phantomjs process.
- Install PhantomJS: http://phantomjs.org/download.html
- npm install phantasma
var Phantasma = require('phantasma');
var ph = new Phantasma();
ph.open('https://duckduckgo.com')
  .type('#search_form_input_homepage', 'phantomjs')
  .click('#search_button_homepage')
  .wait()
  .screenshot('screenshot.png')
  .evaluate(function () {
    return document.querySelectorAll('.result').length;
  })
  .then(function (num) {
    console.log(num + ' results');
  })
  .catch(function (e) {
    console.log('error', e);
  })
  .finally(function () {
    console.log('done!');
    ph.exit();
  });
Any of the above methods can be replaced with a .then, e.g.
var Phantasma = require('phantasma');
var ph = new Phantasma();
ph.then(function () {
    return ph.open('https://duckduckgo.com');
  })
  .screenshot('screenshot.png')
  .finally(function () {
    ph.exit();
  });
This allows for conditionally changing the flow depending on the result of the last request:
var Phantasma = require('phantasma');
var ph = new Phantasma();
ph.open('https://duckduckgo.com')
  .type('#search_form_input_homepage', 'akjsdhjashda')
  .click('#search_button_homepage')
  .wait()
  .evaluate(function () {
    return document.querySelectorAll('.result').length;
  })
  .then(function (num) {
    // No results for the first query: search again before taking the screenshot.
    if (!num) {
      return ph.type('#search_form_input', 'phantomjs')
        .click('#search_button')
        .wait()
        .screenshot('screenshot.png');
    }
    return ph.screenshot('screenshot.png');
  })
  .finally(function () {
    ph.exit();
  });
Creates a new Phantasma instance and starts the phantomjs process.
The available options are:
- diskCache: [true|false]: enables disk cache (default is false).
- ignoreSslErrors: [true|false]: ignores SSL errors, such as expired or self-signed certificate errors (default is true).
- loadImages: [true|false]: load all inlined images (default is true).
- localStoragePath: '/some/path': path to save LocalStorage content and WebSQL content (no default).
- localStorageQuota: [Number]: maximum size to allow for data (no default).
- localToRemoteUrlAccess: [true|false]: allows local content to access remote URL (default is false).
- maxDiskCacheSize: [Number]: limits the size of disk cache in KB (no default).
- binary: specify a different custom path to PhantomJS (no default).
- port: [Number]: specifies the phantomjs port.
- proxy: 'address:port': specifies the proxy server to use (e.g. proxy: '192.168.1.42:8080') (no default).
- proxyType: [http|socks5|none]: specifies the type of the proxy server (default is http).
- proxyAuth: specifies the authentication information for the proxy, e.g. proxyAuth: 'username:password' (no default).
- sslProtocol: [sslv3|sslv2|tlsv1|any]: sets the SSL protocol for secure connections (default is any).
- sslCertificatesPath: '/some/path': sets the location for custom CA certificates (if none set, uses system default).
- timeout: [Number]: how long to wait for page loads in ms (default is 5000).
- webSecurity: [true|false]: enables web security and forbids cross-domain XHR (default is true).
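As a rough illustration of how the options above might be combined, the sketch below merges a few overrides over the defaults stated in this list. buildOptions and DOCUMENTED_DEFAULTS are hypothetical names introduced here, not part of Phantasma, and only options with a stated default are included.

```javascript
// Defaults copied from the option list in this README (assumed accurate).
var DOCUMENTED_DEFAULTS = {
  diskCache: false,
  ignoreSslErrors: true,
  loadImages: true,
  localToRemoteUrlAccess: false,
  sslProtocol: 'any',
  timeout: 5000,
  webSecurity: true
};

// Hypothetical helper: returns a new options object with the caller's
// overrides applied on top of the documented defaults.
function buildOptions(overrides) {
  return Object.assign({}, DOCUMENTED_DEFAULTS, overrides || {});
}

var options = buildOptions({
  timeout: 10000,              // wait up to 10 s for page loads
  loadImages: false,           // skip inlined images
  proxy: '192.168.1.42:8080'   // example address taken from the list above
});

// With PhantomJS and phantasma installed, the object would be passed as:
//   var Phantasma = require('phantasma');
//   var ph = new Phantasma(options);
```

Spelling out the defaults is optional, since any option left out falls back to its default; it can still be handy when logging the effective configuration.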
The page settings below can be passed into new Phantasma(options); alternatively, they can be set individually afterwards using the .pageSetting(setting, value) method.
- javascriptEnabled: [true|false]: defines whether to execute the script in the page or not (defaults to true).
- loadImages: [true|false]: defines whether to load the inlined images or not (defaults to true).
- localToRemoteUrlAccessEnabled: [true|false]: defines whether local resource (e.g. from file) can access remote URLs or not (defaults to false).
- userAgent: String: defines the user agent sent to server when the web page requests resources.
- userName: String: sets the user name used for HTTP authentication.
- password: String: sets the password used for HTTP authentication.
- XSSAuditingEnabled: [true|false]: defines whether load requests should be monitored for cross-site scripting attempts (defaults to false).
- webSecurityEnabled: [true|false]: defines whether web security should be enabled or not (defaults to true).
- resourceTimeout: Number: (in milli-secs) defines the timeout after which any resource requested will stop trying and proceed with other parts of the page. onResourceTimeout event will be called on timeout.
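Several page settings can be applied in one go by chaining .pageSetting(setting, value) calls. The helper below is a minimal sketch: applyPageSettings is a hypothetical name, and it assumes that .pageSetting returns a promise carrying the Phantasma methods, as the fluent API described earlier suggests.

```javascript
// Hypothetical helper, not part of Phantasma.
// Assumption: ph.pageSetting(setting, value) returns a chainable object that
// itself exposes .pageSetting (the fluent API described in this README).
function applyPageSettings(ph, settings) {
  return Object.keys(settings).reduce(function (chain, name) {
    // Each call is chained onto the result of the previous one.
    return chain.pageSetting(name, settings[name]);
  }, ph);
}

// Intended usage with PhantomJS and phantasma installed:
//   var Phantasma = require('phantasma');
//   var ph = new Phantasma();
//   applyPageSettings(ph, { userAgent: 'my-crawler/1.0', loadImages: false })
//     .open('https://duckduckgo.com')
//     .finally(function () { ph.exit(); });
```

The settings are applied in the order the keys appear in the object, so a later key never runs before an earlier one.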
Load the page at url. Will throw a Timeout error if it takes longer to complete than the timeout setting.
Wait until a page finishes loading, typically after a .click(). Will throw a Timeout error if it takes longer to complete than the timeout setting.
Close the phantomjs process.
Clicks the selector element.
Clicks at the position given.
Enters the text provided into the selector element.
Sets the text provided as the value of the selector element.
Sets the value of a select element to value.
Invokes fn on the page with the arguments arg1, arg2, etc. All the args are optional. On completion it passes the return value of fn to the resolved promise. Example:
var Phantasma = require('phantasma');
var p1 = 1;
var p2 = 2;
var ph = new Phantasma();
ph.evaluate(function (param1, param2) {
    // now we're executing inside the browser scope.
    return param1 + param2;
  }, p1, p2)
  .then(function (result) {
    // now we're inside Node scope again
    console.log(result);
  })
  .finally(function () {
    ph.exit();
  });
Set the viewport dimensions
Saves a screenshot of the current page to the specified path. Useful for debugging. Note the path must include the file extension. Supported formats include .png, .gif, .jpeg, and .pdf.
Saves a screenshot of a specific DOM element as an image to the specified path. Note the path must include the file extension. Supported formats include .png, .gif, .jpeg, and .pdf.
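Because the path must carry one of the listed extensions, it can be worth checking it before taking a screenshot. The following is a small sketch under that assumption; hasSupportedExtension and SUPPORTED_EXTENSIONS are hypothetical names, and the list is copied from the text above rather than read from PhantomJS.

```javascript
var path = require('path');

// Formats listed in this README for screenshots.
var SUPPORTED_EXTENSIONS = ['.png', '.gif', '.jpeg', '.pdf'];

// Hypothetical helper, not part of Phantasma: true when the file name ends
// in one of the listed extensions (exact, case-sensitive match).
function hasSupportedExtension(file) {
  return SUPPORTED_EXTENSIONS.indexOf(path.extname(file)) !== -1;
}

console.log(hasSupportedExtension('screenshot.png')); // true
console.log(hasSupportedExtension('screenshot'));     // false
```

A check like this could run just before calling .screenshot(path), so a missing extension is reported in Node rather than discovered later.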
Get the title of the current page. The result is passed to the resolved promise.
Get the url of the current page. The result is passed to the resolved promise.
Go back to the previous page. This will .wait() until the page has loaded.
Go forward to the next page. This will .wait() until the page has loaded.
Refresh the current page. This will .wait() until the page has loaded.
Focus the selector element.
Inject javascript at path into the currently open page.
Inject CSS string style into the currently open page.
Get or set the content of the page: if html is provided the content is set, otherwise the current content is returned.
Set a page setting.
Events extends node's EventEmitter.
Executes callback when the event is emitted.
Example:
var Phantasma = require('phantasma');
var ph = new Phantasma();
ph.open('https://duckduckgo.com')
  .type('#search_form_input_homepage', 'phantomjs')
  .click('#search_button_homepage')
  .wait()
  .catch(function (e) {
    console.log('error', e);
  })
  .finally(function () {
    console.log('done!');
    ph.exit();
  })
  .on('onUrlChanged', function (url) {
    console.log('url change', url);
  });
Executes callback only the first time the event is emitted.
Removes callback from the listeners for event.
Supports the following phantomjs events; you can read more about them in the PhantomJS callbacks documentation:
- onAlert - callback(msg)
- onConsoleMessage - callback(msg, lineNum, sourceId)
- onError - callback(msg, trace)
- onLoadFinished - callback(status)
- onLoadStarted - callback()
- onNavigationRequested - callback(url, type, willNavigate, main)
- onResourceReceived - callback(response)
- onResourceRequested - callback(requestData, networkRequest)
- onResourceTimeout - callback(request)
- onUrlChanged - callback(url)
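To show how a few of these callbacks might be wired up together, here is a hedged sketch. logEvents is a hypothetical helper, not part of Phantasma; it assumes .on can be called directly on the Phantasma instance as well as on the promise chain, and it uses the callback signatures listed above.

```javascript
// Hypothetical helper, not part of Phantasma.
// `ph` is anything with an EventEmitter-style .on(event, callback) method;
// `log` is a function that receives one formatted line per event.
function logEvents(ph, log) {
  ph.on('onConsoleMessage', function (msg, lineNum, sourceId) {
    log('console: ' + msg + ' (' + sourceId + ':' + lineNum + ')');
  });
  ph.on('onError', function (msg, trace) {
    log('page error: ' + msg);
  });
  ph.on('onUrlChanged', function (url) {
    log('url change: ' + url);
  });
  return ph;
}

// Intended usage with PhantomJS and phantasma installed:
//   var Phantasma = require('phantasma');
//   var ph = logEvents(new Phantasma(), console.log);
//   ph.open('https://duckduckgo.com')
//     .finally(function () { ph.exit(); });
```

Passing the log function in keeps the helper easy to redirect, for example to a file or a test buffer instead of the console.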
You can use any of the methods available in the bluebird API.
The most useful methods are:
Returns a new promise chained from the previous promise. The return value of the previous promise will be passed into this promise.
Pass a handler that will be run regardless of the outcome of the previous promises. Useful for cleaning up the Phantasma process, e.g.
.finally(function () {
  ph.exit();
});
This is a catch-all exception handler. It can be used to find and log an error, e.g.
.catch(function (e) {
  console.log(e);
});
Delay the next promise for ms milliseconds.
Copyright (c) 2014, Pete Cooper - pete@petecoop.co.uk
Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.