Driving Google Chrome via WebSocket API
Instrumenting a browser is not for the faint of heart: half a dozen different APIs, different IPC mechanisms, and different capabilities from each vendor. Projects like WebDriver try to abstract this complexity for us, and you can also find dozens of other "headless" drivers leveraging WebKit or similar engines. There is now even a W3C WebDriver spec in the works.
Instrumenting Google Chrome
However, while creating a generic solution is a hard task, it turns out that instrumenting Chrome is a breeze - as I recently discovered while investigating some network latency questions. As of version 18, Chrome supports v1.0 of the Remote Debugging Protocol, which exposes the full capabilities of the browser via a regular WebSocket!
$> /Applications/Path To/Google Chrome --remote-debugging-port=9222 # OSX
$> curl localhost:9222/json
[ {
"devtoolsFrontendUrl": "/devtools/devtools.html?host=localhost:9222&page=1",
"faviconUrl": "",
"thumbnailUrl": "/thumb/chrome://newtab/",
"title": "New Tab",
"url": "chrome://newtab/",
"webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/1"
} ]
First, we enable remote debugging on Chrome (off by default). From there, Chrome exposes an HTTP handler, which allows us to inspect all of the open tabs. Each tab is an isolated process and hence gets its own WebSocket, the address of which is listed under the webSocketDebuggerUrl key. With that, let's put it all together:
require 'em-http'
require 'faye/websocket'
require 'json'

EM.run do
  # Chrome runs an HTTP handler listing available tabs
  conn = EM::HttpRequest.new('http://localhost:9222/json').get

  conn.callback do
    resp = JSON.parse(conn.response)
    puts "#{resp.size} available tabs, Chrome response: \n#{resp}"

    # connect to first tab via the WS debug URL
    ws = Faye::WebSocket::Client.new(resp.first['webSocketDebuggerUrl'])

    ws.onopen = lambda do |event|
      # once connected, enable network tracking
      ws.send JSON.dump({id: 1, method: 'Network.enable'})

      # tell Chrome to navigate to twitter.com and look for "chrome" tweets
      ws.send JSON.dump({
        id: 2,
        method: 'Page.navigate',
        params: {url: 'http://twitter.com/#!/search/chrome?q=chrome&' + rand(100).to_s}
      })
    end

    ws.onmessage = lambda do |event|
      # print event notifications from Chrome to the console
      p [:new_message, JSON.parse(event.data)]
    end
  end
end
In this example we tell Chrome to enable network tracking and notifications, and then tell it to perform a Twitter search. With that, Chrome will forward us dozens of network notifications: the initial page fetch, notifications for each subresource, XHRs, and so on (e.g., the Network.responseReceived event). In fact, if you leave the page running, you will also see the long-poll events firing to fetch the latest tweets. Tons of information, all at your disposal.
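Printing every message gets noisy fast, so here is a minimal sketch of how you might pick out just the Network.responseReceived notifications. The helper name summarize_response is my own invention, and the sample payload is hand-written and heavily trimmed; I am assuming the event carries a params.response object with url, status, and mimeType fields, so check the output of your own Chrome build before relying on it.

```ruby
require 'json'

# Hypothetical helper (not part of any library): takes one raw message
# string as received over the WebSocket and returns a small summary hash,
# or nil if the message is not a Network.responseReceived notification.
def summarize_response(raw)
  msg = JSON.parse(raw)
  return nil unless msg['method'] == 'Network.responseReceived'

  # Assumed payload shape: params.response with url, status and mimeType
  response = msg.fetch('params', {}).fetch('response', {})
  {
    url: response['url'],
    status: response['status'],
    mime_type: response['mimeType']
  }
end

# Hand-written, trimmed sample notification (illustrative only)
sample = JSON.dump({
  method: 'Network.responseReceived',
  params: {
    requestId: '1.1',
    response: { url: 'http://example.com/', status: 200, mimeType: 'text/html' }
  }
})

p summarize_response(sample)

# Replies to our own commands carry an id instead of a method, so they are skipped
p summarize_response(JSON.dump({ id: 1, result: {} }))
```

To use it in the script above, you could call summarize_response(event.data) inside the onmessage handler and print only the non-nil results.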
Remote Debugging (and more) with Chrome
The example above illustrates a very simple interaction with the Network API, but the protocol exposes much more. You can drive the JS debugger, control the V8 VM, modify and inspect the DOM, and track Timeline events amongst half a dozen other capabilities. Finally, while driving a desktop browser is cool, driving a browser on your phone is even nicer: Chrome for Android provides all the same capabilities.
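Commands for the other domains follow the same shape as the Network and Page calls earlier: a unique id, a method name, and optional params. As a rough sketch, a tiny builder can take care of the ids for you. CommandBuilder is a hypothetical class of my own, and the method names used here (Page.enable, Runtime.evaluate, DOM.getDocument) are the ones I understand the protocol to define, so treat them as assumptions and verify them against the protocol documentation for your Chrome version.

```ruby
require 'json'

# Hypothetical helper: serializes protocol commands and hands out
# increasing ids, so each reply from Chrome can be matched to its request.
class CommandBuilder
  def initialize
    @next_id = 0
  end

  # Returns a JSON string ready to be passed to ws.send
  def build(method, params = {})
    @next_id += 1
    message = { id: @next_id, method: method }
    # Omit the params key entirely when there is nothing to send
    message[:params] = params unless params.empty?
    JSON.dump(message)
  end
end

builder = CommandBuilder.new

# Assumed method names, see the note above
puts builder.build('Page.enable')
puts builder.build('Runtime.evaluate', { expression: 'document.title' })
puts builder.build('DOM.getDocument')
```

Each string can be sent over the same WebSocket as before; Chrome's reply should echo the id, which is how you tell command results apart from event notifications.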