No longer maintained

I'd really recommend using Puppeteer, which solves the problem I was trying to solve here and is also really nice and quick.

Phantasma

A high level promise based wrapper for PhantomJS

The aim is to make interacting with PhantomJS from node as simple as possible. All actions are asynchronous and return a bluebird promise. The promises have been extended with Phantasma methods, allowing for a fluent API.

This project is heavily influenced by Nightmare, but takes a different approach. Nightmare queues up actions, which are executed when .run() is called; once they complete, the phantomjs process exits. This is fine if you already know the actions you want to take, but it is not possible to change the flow of actions midway, e.g. if a popup or button sometimes appears on a page and you want to click it before continuing with the next action. Phantasma instead uses promises, which leaves queueing to the promise library (bluebird) and leaves you in control of when to exit the phantomjs process.

Install

npm install phantasma

Examples

var Phantasma = require('phantasma');

var ph = new Phantasma();

ph.open('https://duckduckgo.com')
  .type('#search_form_input_homepage', 'phantomjs')
  .click('#search_button_homepage')
  .wait()
  .screenshot('screenshot.png')
  .evaluate(function () {
    return document.querySelectorAll('.result').length;
  })
  .then(function (num) {
    console.log(num + ' results');
  })
  .catch(function (e) {
    console.log('error', e);
  })
  .finally(function () {
    console.log('done!');
    ph.exit();
  });

Any of the above methods can instead be called from inside a .then handler, e.g.:

var Phantasma = require('phantasma');

var ph = new Phantasma();

ph.then(function () {
    return ph.open('https://duckduckgo.com');
  })
  .screenshot('screenshot.png')
  .finally(function () {
    ph.exit();
  });

This allows for conditionally changing the flow depending on the result of the last request:

var Phantasma = require('phantasma');

var ph = new Phantasma();

ph.open('https://duckduckgo.com')
  .type('#search_form_input_homepage', 'akjsdhjashda')
  .click('#search_button_homepage')
  .wait()
  .evaluate(function () {
    return document.querySelectorAll('.result').length;
  })
  .then(function (num) {
    if(!num){
      return ph.type('#search_form_input', 'phantomjs')
        .click('#search_button')
        .wait()
        .screenshot('screenshot.png');
    }
    return ph.screenshot('screenshot.png');
  })
  .finally(function () {
    ph.exit();
  });

API

new Phantasma(options)

Creates a new instance and starts a phantomjs instance.

The available options are:

  • diskCache: [true|false]: enables disk cache (default is false).
  • ignoreSslErrors: [true|false]: ignores SSL errors, such as expired or self-signed certificate errors (default is true).
  • loadImages: [true|false]: load all inlined images (default is true).
  • localStoragePath: '/some/path': path to save LocalStorage content and WebSQL content (no default).
  • localStorageQuota: [Number]: maximum size to allow for data (no default).
  • localToRemoteUrlAccess: [true|false]: allows local content to access remote URL (default is false).
  • maxDiskCacheSize: [Number]: limits the size of disk cache in KB (no default).
  • binary: specify a different custom path to PhantomJS (no default).
  • port: [Number]: specifies the phantomjs port.
  • proxy: 'address:port': specifies the proxy server to use (e.g. proxy: '192.168.1.42:8080') (no default).
  • proxyType: [http|socks5|none]: specifies the type of the proxy server (default is http).
  • proxyAuth: specifies the authentication information for the proxy, e.g. proxyAuth: 'username:password' (no default).
  • sslProtocol: [sslv3|sslv2|tlsv1|any]: sets the SSL protocol for secure connections (default is any).
  • sslCertificatesPath: '/some/path': sets the location for custom CA certificates (if none set, uses system default).
  • timeout: [Number]: how long to wait for page loads in ms (default is 5000).
  • webSecurity: [true|false]: enables web security and forbids cross-domain XHR (default is true).
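
As a rough illustration of how these options might be combined, the sketch below merges a set of defaults with per-call overrides before constructing an instance. createBrowser and DEFAULTS are hypothetical names, not part of Phantasma, and the values are only examples. The constructor is passed in as a parameter so the helper does not depend on how the module is loaded.

```javascript
// Hypothetical defaults: every key comes from the options list above,
// the values are illustrative only.
var DEFAULTS = {
  timeout: 10000,        // wait up to 10 s for page loads
  loadImages: false,     // skip images to speed up loading
  ignoreSslErrors: true
};

// Hypothetical helper: merges overrides into the defaults and passes
// the result to the given constructor (e.g. require('phantasma')).
function createBrowser(Phantasma, overrides) {
  var options = Object.assign({}, DEFAULTS, overrides);
  return new Phantasma(options);
}
```

It could be called as createBrowser(require('phantasma'), { proxy: '192.168.1.42:8080' }), which keeps the defaults and adds a proxy.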

Page Settings

These options can be passed into new Phantasma(options), alternatively they can be set individually afterwards using the .pageSetting(setting, value) method.

  • javascriptEnabled: [true|false]: defines whether to execute the script in the page or not (defaults to true).
  • loadImages: [true|false]: defines whether to load the inlined images or not (defaults to true).
  • localToRemoteUrlAccessEnabled: [true|false]: defines whether local resource (e.g. from file) can access remote URLs or not (defaults to false).
  • userAgent: String: defines the user agent sent to server when the web page requests resources.
  • userName: String: sets the user name used for HTTP authentication.
  • password: String: sets the password used for HTTP authentication.
  • XSSAuditingEnabled: [true|false]: defines whether load requests should be monitored for cross-site scripting attempts (defaults to false).
  • webSecurityEnabled: [true|false]: defines whether web security should be enabled or not (defaults to true).
  • resourceTimeout: Number: defines the timeout in milliseconds after which any requested resource will stop trying and proceed with other parts of the page. The onResourceTimeout event is emitted on timeout.
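
To set several page settings after construction, the calls to .pageSetting(setting, value) can be chained one after another. The sketch below assumes .pageSetting returns a promise, as all Phantasma actions are described as doing. applyPageSettings is a hypothetical helper, not part of the library.

```javascript
// Hypothetical helper: applies each setting in turn, waiting for the
// previous .pageSetting() promise to resolve before starting the next.
// Assumes ph.pageSetting(name, value) returns a promise.
function applyPageSettings(ph, settings) {
  return Object.keys(settings).reduce(function (chain, name) {
    return chain.then(function () {
      return ph.pageSetting(name, settings[name]);
    });
  }, Promise.resolve());
}
```

For example, applyPageSettings(ph, { userAgent: 'my-agent', javascriptEnabled: false }) returns a promise that resolves once both settings have been applied.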

Methods

.open(url)

Load the page at url. Will throw a Timeout error if it takes longer to complete than the timeout setting.
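
One way to cope with slow pages is to retry the load. The sketch below assumes the Timeout error surfaces as a rejected promise, and it retries on any rejection, because how a Timeout error can be told apart from other errors is not documented here. openWithRetry is a hypothetical helper.

```javascript
// Hypothetical helper: calls ph.open(url) up to `attempts` times.
// Retries on ANY rejection, not only timeouts; the last error is rethrown.
function openWithRetry(ph, url, attempts) {
  return ph.open(url).catch(function (err) {
    if (attempts <= 1) {
      throw err; // out of attempts, let the caller handle it
    }
    return openWithRetry(ph, url, attempts - 1);
  });
}
```

The returned promise comes from .catch, so continue with .then rather than relying on further Phantasma methods being chainable on it.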

.wait()

Wait until a page finishes loading, typically after a .click(). Will throw a Timeout error if it takes longer to complete than the timeout setting.

.exit()

Close the phantomjs process.

.click(selector)

Clicks the selector element.

.click(x, y)

Clicks at the position given.

.type(selector, text)

Enters the text provided into the selector element.

.value(selector, text)

Sets the text provided as the value of the selector element.

.select(selector, value)

Sets the value of a select element to value.

.evaluate(fn, arg1, arg2,...)

Invokes fn on the page with arg1, arg2,.... All the args are optional. On completion it passes the return value of fn to the resolved promise. Example:

var Phantasma = require('phantasma');
var p1 = 1;
var p2 = 2;

var ph = new Phantasma();

ph.evaluate(function (param1, param2) {
    // now we're executing inside the browser scope.
    return param1 + param2;
  }, p1, p2)
  .then(function (result) {
    // now we're inside Node scope again
    console.log(result);
  })
  .finally(function () {
    ph.exit();
  });

.viewport(width, height)

Sets the viewport dimensions.

.screenshot(path)

Saves a screenshot of the current page to the specified path. Useful for debugging. Note the path must include the file extension. Supported formats include .png, .gif, .jpeg, and .pdf.
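
Because the path must carry a file extension, it can help to check it before calling .screenshot(). The sketch below tests only for the four extensions named above; other formats may also be supported, so treat it as a conservative pre-check. hasKnownScreenshotExtension is a hypothetical helper.

```javascript
var path = require('path');

// The four extensions named in this README; the list may not be exhaustive.
var KNOWN_EXTENSIONS = ['.png', '.gif', '.jpeg', '.pdf'];

// Hypothetical helper: true if filePath ends in one of the known extensions.
// The comparison is case-sensitive.
function hasKnownScreenshotExtension(filePath) {
  var ext = path.extname(filePath);
  return KNOWN_EXTENSIONS.indexOf(ext) !== -1;
}
```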

.screenshotDomElement(selector,path)

Saves a screenshot of a specific DOM element to the specified path. Note the path must include the file extension. Supported formats include .png, .gif, .jpeg, and .pdf.

.title()

Get the title of the current page, the result is passed to the resolved promise.

.url()

Get the url of the current page, the result is passed to the resolved promise.

.back()

Go back to the previous page. This will .wait() until the page has loaded.

.forward()

Go forward to the next page. This will .wait() until the page has loaded.

.refresh()

Refresh the current page. This will .wait() until the page has loaded.

.focus(selector)

Focus the selector element.

.injectJs(path)

Inject javascript at path into the currently open page.

.injectCss(style)

Inject CSS string style into the currently open page.

.content(html)

Gets or sets the content of the page: if html is provided, the content is set to it; otherwise the current content is retrieved.

.pageSetting(setting, value)

Set a page setting.

Events

Event support extends Node's EventEmitter.

.on(event, callback)

Executes callback when the event is emitted.

Example:

var Phantasma = require('phantasma');

var ph = new Phantasma();

ph.open('https://duckduckgo.com')
  .type('#search_form_input_homepage', 'phantomjs')
  .click('#search_button_homepage')
  .wait()
  .catch(function (e) {
    console.log('error', e);
  })
  .finally(function () {
    console.log('done!');
    ph.exit();
  }).on('onUrlChanged', function (url) {
    console.log('url change', url);
  });

.once(event, callback)

Executes callback only the first time the event is emitted.

.removeListener(event, callback)

Removes callback from the listeners for event.
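
The difference between .on, .once and .removeListener can be seen with Node's built-in EventEmitter alone. The sketch below uses a plain EventEmitter as a stand-in for a Phantasma instance and emits onUrlChanged by hand; with Phantasma the events would come from phantomjs instead.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for a Phantasma instance; only EventEmitter behaviour is shown.
var emitter = new EventEmitter();
var seen = [];

function onUrl(url) {
  seen.push('on:' + url);
}

emitter.on('onUrlChanged', onUrl);
emitter.once('onUrlChanged', function (url) {
  seen.push('once:' + url);
});

// First emit: both listeners run, then the .once listener is removed.
emitter.emit('onUrlChanged', 'https://duckduckgo.com/');
// Second emit: only the .on listener runs.
emitter.emit('onUrlChanged', 'https://duckduckgo.com/?q=phantomjs');

// After removal, further emits reach no listeners.
emitter.removeListener('onUrlChanged', onUrl);
emitter.emit('onUrlChanged', 'https://example.com/');

console.log(seen);
```

Note that .removeListener needs a reference to the same function that was registered, which is why onUrl is a named function here.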

Supported Events:

The following phantomjs events are supported; you can read more about them in the PhantomJS callbacks documentation:

  • onAlert - callback(msg)
  • onConsoleMessage - callback(msg, lineNum, sourceId)
  • onError - callback(msg, trace)
  • onLoadFinished - callback(status)
  • onLoadStarted - callback()
  • onNavigationRequested - callback(url, type, willNavigate, main)
  • onResourceReceived - callback(response)
  • onResourceRequested - callback(requestData, networkRequest)
  • onResourceTimeout - callback(request)
  • onUrlChanged - callback(url)
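
As an example of using these callback signatures, the sketch below records page console output and page errors into an array for later inspection. collectPageLogs is a hypothetical helper, and it only assumes that the object passed in exposes .on(event, callback).

```javascript
// Hypothetical helper: collects onConsoleMessage and onError events.
// `ph` is anything with .on(event, callback), such as a Phantasma instance.
function collectPageLogs(ph) {
  var entries = [];
  ph.on('onConsoleMessage', function (msg, lineNum, sourceId) {
    entries.push({ type: 'console', message: msg, line: lineNum, source: sourceId });
  });
  ph.on('onError', function (msg, trace) {
    entries.push({ type: 'error', message: msg, trace: trace });
  });
  return entries; // the array fills up as events arrive
}
```

Calling var logs = collectPageLogs(ph) before .open() lets you print logs from a .finally handler when debugging a failing flow.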

Promise methods

You can use any of the methods available in bluebird; see the bluebird API documentation.

The most useful methods are:

.then(fulfillHandler, rejectHandler)

Returns a new promise chained from the previous promise. The return value of the previous promise will be passed into this promise.

.finally(Function handler)

Pass a handler that will be run regardless of the outcome of the previous promises. Useful for cleaning up the Phantasma process, e.g.:

.finally(function () {
  ph.exit();
});

.catch(Function handler)

This is a catch-all exception handler. It can be used to find and log an error, e.g.:

.catch(function (e) {
  console.log(e);
});

.delay(ms)

Delays the next promise for ms milliseconds.

License

ISC

Copyright (c) 2014, Pete Cooper - pete@petecoop.co.uk

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.