Goodreads Books Scraper

Name: Goodreads Books Scraper
Author: shahidirfan

14 runs

2 users

About Goodreads Books Scraper

Efficiently extract detailed book data with the Goodreads Books Scraper. Ideal for building reading lists or analyzing metadata. Note: For bulk scraping of more than 50 books, providing JSON cookies is essential to ensure seamless access and reliable results.

What does this actor do?

Goodreads Books Scraper is a web scraping actor available on the Apify platform. It runs in the cloud and extracts book data from Goodreads shelves.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results

Documentation

# Goodreads Book Scraper

Extract comprehensive book data from Goodreads shelves including titles, authors, ratings, reviews, descriptions, ISBNs, genres, and publication details. Perfect for book analysis, market research, reading list creation, and literary data collection.

## What does the Goodreads Book Scraper do?

The Goodreads Book Scraper enables you to extract detailed book information from any Goodreads shelf or category. Whether you're building a reading recommendation system, conducting market research, or creating a personal book database, this scraper provides all the data you need.

### Key capabilities:

- 📚 Extract book details - Titles, authors, ratings, review counts, descriptions, and more
- 🔄 Automatic pagination - Seamlessly navigate through multiple pages of results
- ⚡ Fast & efficient - Lightweight design optimized for speed and reliability
- 📊 Structured data - Clean JSON output ready for analysis or integration
- 🎯 Flexible targeting - Scrape any Goodreads shelf by name or URL
- 🔍 Two scraping modes - Quick overview or detailed book information

## Why scrape Goodreads?

Goodreads is the world's largest community of book lovers with over 90 million members and data on millions of books. Access to this data enables:

- Market research - Analyze book trends, popular genres, and reader preferences
- Recommendation systems - Build personalized book recommendation engines
- Content curation - Create reading lists and book collections
- Price monitoring - Track book popularity for inventory decisions
- Academic research - Study reading patterns and literary trends
- Personal libraries - Organize and manage your reading lists

## How much does it cost to scrape Goodreads?

The cost depends on the number of books you scrape and whether you enable detailed scraping. Here are typical usage estimates:

- 100 books (basic) - ~0.01-0.02 Apify compute units
- 100 books (detailed) - ~0.03-0.05 Apify compute units
- 1,000 books (detailed) - ~0.30-0.50 Apify compute units

Apify provides 5 USD of free credits monthly, enough to scrape thousands of books. For larger projects, paid plans start at $49/month.

## Input configuration

Configure the scraper using these parameters:

### Basic settings

| Parameter | Description |
| --- | --- |
| Start URL | Direct URL to a Goodreads shelf (e.g., `https://www.goodreads.com/shelf/show/fantasy`) |
| Shelf Name | Name of the shelf to scrape (e.g., `fantasy`, `science-fiction`, `bestsellers`) |
| Maximum Books | Number of books to scrape (default: 100) |
| Maximum Pages | Safety limit on pages to visit (default: 10) |

### Advanced settings

| Parameter | Description |
| --- | --- |
| Collect Details | Enable to extract full book information including descriptions, ISBNs, and genres (default: enabled) |
| Cookies | Authentication cookies for accessing paginated results (required for pages beyond the first) |
| Proxy Configuration | Proxy settings (residential proxies recommended) |
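
Start URL and Shelf Name describe the same shelf in two forms. The following is a minimal sketch of converting between them. It assumes shelf URLs always follow the `/shelf/show/<name>` pattern from the example above and that later pages are selected with a `page` query parameter. `shelfUrl` and `shelfNameFromUrl` are hypothetical helpers for illustration, not part of the actor.

```javascript
// Assumption: every shelf lives under this base path, as in the Start URL example.
const SHELF_BASE = 'https://www.goodreads.com/shelf/show/';

// Build the URL for a shelf, optionally for a specific results page.
function shelfUrl(shelfName, page = 1) {
    const url = new URL(SHELF_BASE + encodeURIComponent(shelfName));
    if (page > 1) url.searchParams.set('page', String(page));
    return url.toString();
}

// Recover the shelf name from a shelf URL; returns null for non-shelf URLs.
function shelfNameFromUrl(startUrl) {
    const { pathname } = new URL(startUrl);
    const match = pathname.match(/^\/shelf\/show\/([^/]+)\/?$/);
    return match ? decodeURIComponent(match[1]) : null;
}

console.log(shelfUrl('fantasy', 2));
console.log(shelfNameFromUrl('https://www.goodreads.com/shelf/show/science-fiction'));
```

Normalizing to a single form this way makes it easier to supply only one of the two settings per run.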

### Example input

```json
{
  "shelf": "fantasy",
  "results_wanted": 100,
  "max_pages": 5,
  "collectDetails": true,
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"]
  }
}
```

## Output format

The scraper provides structured JSON data for each book:

### Basic output (without detailed scraping)

```json
{
  "title": "The Name of the Wind",
  "author": "Patrick Rothfuss",
  "rating": 4.52,
  "ratingCount": 985432,
  "reviewCount": 45678,
  "image": "https://i.gr-assets.com/images/S/...",
  "url": "https://www.goodreads.com/book/show/186074"
}
```

### Detailed output (with detailed scraping enabled)

```json
{
  "title": "The Name of the Wind",
  "author": "Patrick Rothfuss",
  "rating": 4.52,
  "ratingCount": 985432,
  "reviewCount": 45678,
  "description": "Told in Kvothe's own voice, this is the tale of the magically gifted young man...",
  "image": "https://i.gr-assets.com/images/S/...",
  "isbn": "0756404746",
  "publisher": "DAW Books",
  "publishDate": "March 27, 2007",
  "genres": ["Fantasy", "Fiction", "Magic", "Adventure"],
  "url": "https://www.goodreads.com/book/show/186074"
}
```

### Output fields

| Field | Type | Description |
| --- | --- | --- |
| title | string | Book title |
| author | string | Primary author name(s) |
| rating | number | Average rating (0-5 scale) |
| ratingCount | number | Total number of ratings |
| reviewCount | number | Total number of reviews |
| description | string | Book description/synopsis (detailed mode only) |
| image | string | URL to book cover image |
| isbn | string | ISBN identifier (detailed mode only) |
| publisher | string | Publisher name (detailed mode only) |
| publishDate | string | Publication date (detailed mode only) |
| genres | array | List of book genres/categories (detailed mode only) |
| url | string | Goodreads book URL |
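
Because some fields appear only in detailed mode, it can help to check each item against the field list before analysis. The following is a minimal sketch, assuming items are plain objects shaped like the output examples above. `missingFields` and the two field lists are hypothetical helpers derived from the table, not something the actor exports.

```javascript
// Field names taken from the output fields table.
const BASIC_FIELDS = ['title', 'author', 'rating', 'ratingCount', 'reviewCount', 'image', 'url'];
const DETAILED_ONLY_FIELDS = ['description', 'isbn', 'publisher', 'publishDate', 'genres'];

// Return the expected fields that are absent or empty on a scraped item.
function missingFields(book, detailed = true) {
    const expected = detailed ? [...BASIC_FIELDS, ...DETAILED_ONLY_FIELDS] : BASIC_FIELDS;
    return expected.filter((field) => {
        const value = book[field];
        return value === undefined || value === null || value === ''
            || (Array.isArray(value) && value.length === 0);
    });
}

// Sample item in the basic output shape.
const basicItem = {
    title: 'The Name of the Wind',
    author: 'Patrick Rothfuss',
    rating: 4.52,
    ratingCount: 985432,
    reviewCount: 45678,
    image: 'https://i.gr-assets.com/images/S/...',
    url: 'https://www.goodreads.com/book/show/186074',
};

console.log(missingFields(basicItem, false)); // nothing missing in basic mode
console.log(missingFields(basicItem, true));  // lists the detailed-only fields
```

Items that report missing fields can be logged or re-scraped rather than silently skewing an analysis.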

## How to use the Goodreads Book Scraper

### Using the Apify Console

1. Navigate to the Goodreads Book Scraper on Apify
2. Click Try for free
3. Enter your configuration:
   - Shelf name (e.g., "fantasy", "bestsellers")
   - Number of books you want to scrape
   - Toggle Collect Details for comprehensive data
4. Click Start to begin scraping
5. Download results in JSON, CSV, Excel, or HTML format

### Using the Apify API

```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
    token: 'YOUR_API_TOKEN',
});

// CommonJS does not allow top-level await, so run the call inside an async function.
async function main() {
    // Start the actor and wait for the run to finish.
    const run = await client.actor('YOUR_USERNAME/goodreads-book-scraper').call({
        shelf: 'fantasy',
        results_wanted: 100,
        collectDetails: true,
    });

    // Fetch the scraped books from the run's default dataset.
    const { items } = await client.dataset(run.defaultDatasetId).listItems();
    console.log(items);
}

main().catch(console.error);
```

### Using as a standalone script

1. Clone this repository
2. Run `npm install`
3. Configure `INPUT.json` with your parameters
4. Run `npm start`

## Important notes on pagination

⚠️ Authentication requirement: Goodreads restricts pagination to authenticated users. Users who are not logged in can only access the first page (approximately 50 books).

### To access multiple pages:

1. Log in to Goodreads in your browser
2. Open DevTools (F12) → Network tab
3. Reload the page and find a request to goodreads.com
4. Copy the Cookie header from the request headers
5. Paste the cookie value into the "Authentication cookies" field

The scraper will use your cookies to access paginated results.
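
A Cookie header copied from DevTools is a single `name=value; name=value` string, while the actor summary mentions JSON cookies. The exact shape the cookies field expects is not documented here, so the following is only a sketch that assumes an array of `{ name, value, domain }` objects. `cookieHeaderToJson`, the default domain, and the sample cookie names are all hypothetical.

```javascript
// Convert a raw Cookie header string into an array of cookie objects.
// Assumption: the target format is [{ name, value, domain }, ...].
function cookieHeaderToJson(header, domain = '.goodreads.com') {
    return header
        .split(';')
        .map((pair) => pair.trim())
        // Keep only well-formed pairs with a non-empty name.
        .filter((pair) => pair.indexOf('=') > 0)
        .map((pair) => {
            const idx = pair.indexOf('=');
            // Split on the first '=' only; cookie values may contain '='.
            return { name: pair.slice(0, idx), value: pair.slice(idx + 1), domain };
        });
}

// Placeholder header; real cookie names and values will differ.
const sampleHeader = 'session_id=abc123; locale=en';
console.log(JSON.stringify(cookieHeaderToJson(sampleHeader), null, 2));
```

If the field accepts the raw header string instead, no conversion is needed. Treat session cookies like passwords and avoid committing them to version control.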
Pagination URLs follow this pattern: `https://www.goodreads.com/shelf/show/fantasy?page=2`

## Popular Goodreads shelves to scrape

Get started quickly with these popular shelves:

- `fantasy` - Fantasy fiction and magic
- `science-fiction` - Sci-fi and speculative fiction
- `romance` - Romance novels
- `mystery` - Mystery and thriller books
- `young-adult` - YA fiction
- `classics` - Classic literature
- `non-fiction` - Non-fiction works
- `biography` - Biographies and memoirs
- `history` - Historical works
- `self-help` - Self-improvement books
- `business` - Business books
- `philosophy` - Philosophy texts

You can find more shelves by browsing Goodreads Shelves.

## Scraping best practices

### Performance optimization

- Set reasonable limits - Use `results_wanted` to control scraping volume
- Enable detailed scraping selectively - Disable if you only need basic information
- Use residential proxies - Required for accessing multiple pages
- Implement rate limiting - The scraper includes built-in concurrency controls

### Data quality

- Validate output - Check that all expected fields are populated
- Handle missing data - Some books may have incomplete information
- Monitor for changes - Goodreads may update their HTML structure

### Compliance

- Respect robots.txt - The scraper follows Goodreads guidelines
- Don't overload servers - Use appropriate concurrency settings
- Review Terms of Service - Ensure your use case complies with Goodreads policies
- Personal use recommended - Commercial use may require additional consideration

## Troubleshooting

### No books found on page 2+

Solution: You need to provide authentication cookies. See the pagination section above.

### Scraper returns incomplete data

Solution: Enable "Collect Details" to fetch comprehensive book information.

### Rate limiting or blocked requests

Solution: Use residential proxies and reduce concurrency if needed.

### Outdated selectors

Solution: Goodreads occasionally updates their website. Contact support if selectors need updating.

## Use cases

### Market Research

Analyze book trends, identify popular genres, and understand reader preferences to make data-driven publishing decisions.

### Recommendation Systems

Build sophisticated book recommendation engines using ratings, genres, and reader reviews.

### Academic Research

Study literary trends, analyze reading patterns, and conduct research on book popularity and cultural impact.

### Content Creation

Create curated reading lists, book blogs, and literary content based on comprehensive book data.

### Personal Library Management

Organize your reading lists, track books to read, and manage your personal book collection.

## Support

Need help? Have questions?

- Documentation: Check out the detailed Apify documentation
- Community: Join the Apify Discord
- Issues: Report bugs or request features on the GitHub repository

## Related actors

Explore similar scrapers:

- Amazon Book Scraper - Extract book data from Amazon
- Barnes & Noble Scraper - Scrape B&N book listings
- Google Books Scraper - Extract data from Google Books
- Book Price Monitor - Track book prices across platforms

---

Built with ❤️ for the reading community. Happy scraping!

Actor Information

Developer: shahidirfan
Pricing: Paid
Total Runs: 14
Active Users: 2
