Grep the web with Fresno: a command line for Firefox

Ben from Simile points me to Fresno, a tool that connects to a MozRepl-equipped Firefox and drives it from the command line.

Fresno can make a running Firefox navigate to URLs, load JavaScript files, and execute JavaScript commands. The execution context can stay at the browser level or be switched to the currently loaded web page or to an arbitrary object. This example from the documentation fetches the markup of one page element and greps it for links:


  % ./fresno -p http://simile.mit.edu/ -c \
    -j "document.getElementById('slideshow').innerHTML" \
    | grep href
            <div class="title"><a href="semantic-bank/">Semantic Bank</a></div>
            <div class="title"><a href="gadget/">Gadget</a></div>
            <div class="title"><a href="welkin/">Welkin</a></div>

            <div class="title"><a href="timeline/">Timeline</a></div>
            <div class="title"><a href="referee/">Referee</a></div>
            <div class="title"><a href="babel/">Babel</a></div>

            <div class="title"><a href="exhibit/">Exhibit</a></div>
            <div class="title"><a href="appalachian/">Appalachian</a></div>
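
The `grep href` step returns whole lines of markup. If only the link targets are wanted, the fragment can be post-processed further. What follows is a minimal sketch, assuming Python 3 is available; the names `HrefCollector` and `extract_hrefs` are made up for illustration and are not part of Fresno or MozRepl.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def extract_hrefs(html_fragment):
    """Return the href values found in an HTML fragment, in document order."""
    parser = HrefCollector()
    parser.feed(html_fragment)
    parser.close()
    return parser.hrefs


# Two lines copied from the output above, used here as sample input.
sample = (
    '<div class="title"><a href="semantic-bank/">Semantic Bank</a></div>\n'
    '<div class="title"><a href="gadget/">Gadget</a></div>\n'
)
print(extract_hrefs(sample))  # ['semantic-bank/', 'gadget/']
```

To apply this to live output, one option would be to pass `sys.stdin.read()` to `extract_hrefs` and pipe the Fresno command into the script in place of `grep href`.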

I’m pleased to report that, despite being surrounded by nothing but ink-black X terminals, the little red panda is starting to feel very much at home on my Unix desktop.

Update 2007-07-04: ZIGOROu also points me to his MozRepl Perl module!

Sweet. I’ve updated the post and linked both on the MozRepl page; later I’ll add a wiki page for each. Thanks!

I'm looking for a script, JavaScript or otherwise, that captures one specific piece of content from a web page and copies it into my local database. Do you have any tips or solutions? Thanks.

Any chance of having a MozRepl in Python? That would be great...

M.E.
