PowerShell script that grabs the book list from www.free-ebooks-download.org

Don't run it too often: it downloads every category page and every page linked from each one.
 
# Download the home page and collect the category URLs it links to.
$wc = New-Object System.Net.WebClient
$homePage = $wc.DownloadString("http://www.free-ebooks-download.org/")
$regex = [regex]'http://www.free-ebooks-download.org/free-ebook/[-\.\w/\s]*'
$categoryMatches = $regex.Matches($homePage)
foreach ($cm in $categoryMatches)
{
    # Each category page links to further pages by relative file name.
    $categoryPage = $wc.DownloadString($cm.Value)
    $pageRegex = [regex]'&nbsp;<a href="([-\w.]+)">'
    $pageMatches = $pageRegex.Matches($categoryPage)
    foreach ($pm in $pageMatches)
    {
        # Build the page URL from the category URL plus the relative name.
        $bookPage = $wc.DownloadString($cm.Value + $pm.Groups[1].Value)
        $bookRegex = [regex]'class="main" href="[^"]+" title="([^"]+)">'
        $bookMatches = $bookRegex.Matches($bookPage)
        foreach ($bm in $bookMatches)
        {
            # Group 1 holds the title attribute, i.e. the book title.
            echo $bm.Groups[1].Value
        }
    }
}
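
To check the title pattern without hitting the site, it can be run against a small local string. This is only a sketch: the HTML in $sampleHtml is made up for illustration (the real pages may be marked up differently), and $sampleHtml and $titles are names introduced here, not part of the script.

```powershell
# Hypothetical sample of a listing page; the real site's markup may differ.
$sampleHtml = @'
<a class="main" href="/book/1.html" title="Learning PowerShell">Learning PowerShell</a>
<a class="main" href="/book/2.html" title="Regex Basics">Regex Basics</a>
<a class="other" href="/about.html" title="About">About</a>
'@

# Same title pattern as in the script.
$bookRegex = [regex]'class="main" href="[^"]+" title="([^"]+)">'

# Collect capture group 1 (the title attribute) from every match.
$titles = foreach ($m in $bookRegex.Matches($sampleHtml)) { $m.Groups[1].Value }
$titles
```

Only the two links with class="main" should match; the third link is skipped because its class differs.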