Lobsters Scraper

Name: Lobsters Scraper
Author: epctex

2,886 runs

4 users

About Lobsters Scraper

Scrape Lobste.rs posts and users based on any search criteria. Retrieve all the comments, domains, tags, titles, number of upvotes, and published dates. Use this fast actor to retrieve all the information right away. Easy to use, with no limits!

What does this actor do?

Lobsters Scraper is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.

Key Features

Cloud-based execution - no local setup required
Scalable infrastructure for large-scale operations
API access for integration with your applications
Built-in proxy rotation and anti-blocking measures
Scheduled runs and webhooks for automation

How to Use

Click "Try This Actor" to open it on Apify
Create a free Apify account if you don't have one
Configure the input parameters as needed
Run the actor and download your results
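
For the "configure the input parameters" step, the input can be assembled and sanity-checked before it is pasted into the actor. The following is a minimal sketch, not part of the actor itself: the field names (`startUrls`, `search`, `endPage`, `maxItems`, `proxy`) come from the Input Parameters list in the documentation, while `build_input` and its checks (Lobsters URL prefix, positive integers) are hypothetical helpers based on assumptions about what counts as valid input.

```python
import json

# Assumption: every start URL should point at lobste.rs, since the
# documentation describes startUrls as a "List of Lobsters URLs".
LOBSTERS_PREFIX = "https://lobste.rs/"


def build_input(start_urls=(), search=None, end_page=None, max_items=None):
    """Assemble an actor input dict (hypothetical helper, not the actor's API)."""
    for url in start_urls:
        if not url.startswith(LOBSTERS_PREFIX):
            raise ValueError(f"not a Lobsters URL: {url}")

    # endPage and maxItems are documented as numbers; this sketch assumes
    # they must be positive integers.
    for name, value in (("endPage", end_page), ("maxItems", max_items)):
        if value is None:
            continue
        if not isinstance(value, int) or isinstance(value, bool) or value < 1:
            raise ValueError(f"{name} must be a positive integer")

    # proxy is the only field documented as required.
    actor_input = {"proxy": {"useApifyProxy": True}}
    if start_urls:
        actor_input["startUrls"] = list(start_urls)
    if search:
        actor_input["search"] = search
    if end_page is not None:
        actor_input["endPage"] = end_page
    if max_items is not None:
        actor_input["maxItems"] = max_items
    return actor_input


if __name__ == "__main__":
    example = build_input(
        start_urls=["https://lobste.rs/t/devops", "https://lobste.rs/recent"],
        end_page=2,
        max_items=10,
    )
    # The printed JSON can be pasted into the actor's input editor.
    print(json.dumps(example, indent=2))
```

Optional fields are left out when unset, so the actor's own defaults apply to them.
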

Documentation

Actor - Lobsters Scraper

## Lobsters scraper

Since Lobste.rs doesn't provide a good, free API, this actor helps you retrieve data from it.

The Lobsters data scraper supports the following features:

- Search any keyword - Search for any keyword and get the matching results.
- Scrape domains - Get all the posts from each of the domains represented on lobste.rs.
- Get posts by tags - Scrape the posts filed under a given tag.
- Retrieve user detail - Fetch the details of a specific user.
- Fetch comments of any post - All the comments posted under a post are included in the results.
- Get active and recent posts - Active and recent posts can be harvested directly from Lobsters.

## Bugs, fixes, updates, and changelog

This scraper is under active development. If you have any feature requests, you can create an issue.

## Upcoming Features

- Integrate hierarchical comment tree structure.

## Input Parameters

The input of this scraper should be JSON containing the list of pages on Lobsters that should be visited. Possible fields are:

- `search`: (Optional) (String) Keyword that you want to search on Lobsters.
- `startUrls`: (Optional) (Array) List of Lobsters URLs. You should only provide domains, tags, user detail, post detail, active posts, recent posts, or search URLs.
- `endPage`: (Optional) (Number) The last page number that you want to scrape. The default is `Infinite`. This applies to each `search` request and each of the `startUrls` individually.
- `maxItems`: (Optional) (Number) Limits the number of scraped items. This is useful when you go through big lists or search results.
- `proxy`: (Required) (Proxy Object) Proxy configuration.
- `extendOutputFunction`: (Optional) (String) Function that takes a JQuery handle ($) as an argument and returns an object with data.
- `customMapFunction`: (Optional) (String) Function that takes each scraped object as an argument and returns the object that results from executing the function.

This solution requires the use of proxy servers: either your own proxy servers or Apify Proxy.

### Tip

When you want to scrape a specific list URL, just copy and paste the link as one of the `startUrls`. If you would like to scrape only the first page of a list, put in the link for that page and set `endPage` to 1.

With the same approach you can also fetch any interval of pages. If you provide the 5th page of a list and set the `endPage` parameter to 6, you'll get the 5th and 6th pages only.

### Compute Unit Consumption

The actor is optimized to run fast and scrape as many items as possible. Therefore, it prioritizes all the detail requests. If the actor doesn't get blocked very often, it'll scrape 100 listings in 30 seconds with ~0.01-0.02 compute units.

### Lobsters Scraper Input example

```json
{
  "startUrls": [
    "https://lobste.rs/domains/google.com",
    "https://lobste.rs/t/devops",
    "https://lobste.rs/u/lambda",
    "https://lobste.rs/active",
    "https://lobste.rs/recent",
    "https://lobste.rs/search?q=google&what=stories&order=newest"
  ],
  "maxItems": 10,
  "endPage": 2,
  "proxy": {
    "useApifyProxy": true
  }
}
```

## During the Run

During the run, the actor outputs messages letting you know what is going on. Each message contains a short label specifying which page from the provided list is currently being processed. When items are loaded from a page, you should see a message about this event with the loaded item count and total item count for that page.

If you provide incorrect input to the actor, it will immediately stop with a failure state and output an explanation of what is wrong.

## Lobsters Export

During the run, the actor stores results in a dataset. Each scraped item is a separate item in the dataset.

You can manage the results in any language (Python, PHP, Node JS/NPM).
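
As one way of managing the results in Python, the sketch below groups dataset items by their `type` field and converts the string-typed fields into native values. It assumes the items have already been exported from the dataset as a list of dicts, and that they follow the sample items shown under Scraped Lobsters Properties (numeric user fields stored as strings, dates formatted like `2023-03-09 12:24:32 -0600`). The helper names are hypothetical and not part of the actor.

```python
from collections import defaultdict
from datetime import datetime

# Assumption: all dates follow the format seen in the sample items,
# e.g. "2023-03-09 12:24:32 -0600".
DATE_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def parse_date(value):
    """Parse a date string from an item into a timezone-aware datetime."""
    return datetime.strptime(value, DATE_FORMAT)


def group_by_type(items):
    """Split dataset items into lists keyed by their "type" field."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("type", "unknown")].append(item)
    return dict(groups)


def normalize_user(user):
    """Return a copy of a user item with counts as int and joinedAt parsed."""
    cleaned = dict(user)
    for key in ("karma", "numberOfComments", "numberOfStories"):
        if key in cleaned:
            cleaned[key] = int(cleaned[key])
    if "joinedAt" in cleaned:
        cleaned["joinedAt"] = parse_date(cleaned["joinedAt"])
    return cleaned


def normalize_post(post):
    """Return a copy of a post item with its own and its comments' dates parsed."""
    cleaned = dict(post)
    if "date" in cleaned:
        cleaned["date"] = parse_date(cleaned["date"])
    if "comments" in cleaned:
        cleaned["comments"] = [
            {**comment, "date": parse_date(comment["date"])}
            if "date" in comment
            else dict(comment)
            for comment in cleaned["comments"]
        ]
    return cleaned
```

Both `normalize_*` functions return copies, so the exported items stay untouched and can be re-processed later.
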
See the FAQ or our API reference to learn more about getting results from this Lobsters actor.

## Scraped Lobsters Properties

The structure of each item in Lobsters looks like this:

### User Detail

```json
{
  "type": "user",
  "name": "lambda",
  "url": "https://lobste.rs/u/lambda",
  "avatar": "https://lobste.rs/avatars/lambda-100.png",
  "status": "Active user",
  "homepage": "https://maxcountryman.com",
  "github": "https://github.com/maxcountryman",
  "about": "Indie hacker and people-first leader. Building https://remotejobs.com in public on Twitter.",
  "joinedAt": "2013-12-30 09:46:46 -0600",
  "karma": "392",
  "numberOfComments": "10",
  "numberOfStories": "28"
}
```

### Post Detail

```json
{
  "type": "post",
  "id": "kour63",
  "url": "https://lobste.rs/s/kour63/help_test_cargo_s_new_index_protocol",
  "title": "Help test Cargo's new index protocol",
  "link": "https://blog.rust-lang.org/inside-rust/2023/01/30/cargo-sparse-protocol.html",
  "numberOfUpvotes": 13,
  "userName": "icefox",
  "userLink": "https://lobste.rs/u/icefox",
  "domain": "blog.rust-lang.org",
  "date": "2023-03-09 12:24:32 -0600",
  "tags": [
    "devops",
    "rust"
  ],
  "comments": [
    {
      "id": "dudcdn",
      "body": "Rust 1.68.0 has been released so this is now usable in stable Rust too. Still opt-in though. https://blog.rust-lang.org/2023/03/09/Rust-1.68.0.html",
      "numberOfUpvotes": 3,
      "date": "2023-03-09 16:48:32 -0600",
      "userLink": "https://lobste.rs/u/wezm",
      "userName": "wezmlink"
    }
  ]
}
```

## Contact

Please visit us through epctex.com to see all the products that are available for you. If you are looking for a custom integration, please reach out to us through the chat box on epctex.com. In need of support? business@epctex.com is at your service.

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Lobsters Scraper now on Apify. Free tier available with no credit card required.

Actor Information

Developer: epctex
Pricing: Paid
Total Runs: 2,886
Active Users: 4

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.
