Researchgpt Deep Research Agent
by wheat_tourist
About Researchgpt Deep Research Agent
🔬 Transform any topic into a comprehensive research report in minutes! Scrapes Wikipedia, arXiv, Semantic Scholar, news & web sources. Outputs professional JSON, HTML & PDF reports. Perfect for students, researchers, content creators & businesses. No API keys needed.
What does this actor do?
Researchgpt Deep Research Agent is a web scraping and automation tool available on the Apify platform. It's designed to help you extract data and automate tasks efficiently in the cloud.
Key Features
- Cloud-based execution - no local setup required
- Scalable infrastructure for large-scale operations
- API access for integration with your applications
- Built-in proxy rotation and anti-blocking measures
- Scheduled runs and webhooks for automation
How to Use
- Click "Try This Actor" to open it on Apify
- Create a free Apify account if you don't have one
- Configure the input parameters as needed
- Run the actor and download your results
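For the "Configure the input parameters" step, it can help to assemble and check the input before starting a run. The sketch below is a minimal illustration, not the actor's own validation logic: `build_run_input` is a hypothetical helper, and the parameter names, defaults, and the 1-20 range for `maxSourcesPerType` are taken from the Parameter Reference in the Documentation section.

```python
import copy
import json

# Defaults copied from the actor's Parameter Reference table.
DEFAULTS = {
    "outputFormats": ["json", "html", "pdf"],
    "maxSourcesPerType": 10,
    "includeWikipedia": True,
    "includeAcademic": True,
    "includeNews": True,
    "includeGeneral": True,
    "searchProviders": ["duckduckgo"],
    "requestTimeout": 30,
    "maxRetries": 3,
    "debug": False,
}

# proxyConfiguration is documented with a null default, so it is accepted
# as an override but not sent unless the caller provides it.
KNOWN_KEYS = set(DEFAULTS) | {"proxyConfiguration"}
ALLOWED_FORMATS = {"json", "html", "pdf"}


def build_run_input(topic, **overrides):
    """Hypothetical helper: merge overrides into the documented defaults
    and reject values the Parameter Reference rules out."""
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("topic is required and must be a non-empty string")

    unknown = set(overrides) - KNOWN_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    # Deep copy so callers never mutate the shared default lists.
    run_input = copy.deepcopy(DEFAULTS)
    run_input.update(overrides)
    run_input["topic"] = topic.strip()

    if not 1 <= run_input["maxSourcesPerType"] <= 20:
        raise ValueError("maxSourcesPerType must be between 1 and 20")

    bad_formats = set(run_input["outputFormats"]) - ALLOWED_FORMATS
    if bad_formats:
        raise ValueError(f"unsupported output formats: {sorted(bad_formats)}")

    return run_input


if __name__ == "__main__":
    example = build_run_input(
        "quantum computing practical applications",
        maxSourcesPerType=5,
        includeNews=False,
    )
    print(json.dumps(example, indent=2))
```

The resulting dictionary can be pasted into the actor's input form as JSON or passed as `run_input` when calling the actor through the Apify client.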
Documentation
# 🔬 ResearchGPT - Deep Research Agent

### Transform any topic into a comprehensive research report in minutes, not hours.

---

## 🎯 What is ResearchGPT?

ResearchGPT is your AI-powered research assistant that does in 3 minutes what would take you 3+ hours manually. Simply enter any topic, and ResearchGPT will:

- ✅ Search across multiple engines (DuckDuckGo, Brave, Mojeek)
- ✅ Scrape Wikipedia, arXiv, Semantic Scholar, OpenAlex, CrossRef
- ✅ Extract the latest news articles and web content
- ✅ Process everything with intelligent NLP analysis
- ✅ Generate beautiful reports in JSON, HTML & PDF formats

No API keys required. No complex setup. Just results.

---

## 🚀 Perfect For

| Use Case | How ResearchGPT Helps |
|----------|----------------------|
| 📚 Students & Academics | Literature reviews, thesis research, citation gathering |
| ✍️ Content Creators | Blog research, fact-checking, source compilation |
| 💼 Business Analysts | Market research, competitive analysis, trend reports |
| 🔬 Researchers | Cross-referencing sources, academic paper aggregation |
| 📰 Journalists | Background research, source verification, story development |
| 🤖 AI/ML Projects | Training data collection, knowledge base building |

---

## ⚡ Quick Start

### 1. Run on Apify (Easiest)

1. Go to the ResearchGPT Actor page
2. Enter your research topic
3. Click Start
4. Download your reports! 📄

### 2. Via API

```bash
curl -X POST "https://api.apify.com/v2/acts/YOUR_USERNAME~researchgpt-deep-research-agent/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"topic": "quantum computing breakthroughs 2025"}'
```

### 3. Via Apify SDK (Python)

```python
from apify_client import ApifyClient

client = ApifyClient("YOUR_API_TOKEN")

run = client.actor("YOUR_USERNAME/researchgpt-deep-research-agent").call(
    run_input={"topic": "artificial intelligence in healthcare"}
)

# Get results
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```

---

## 📊 What You Get

### Three Professional Output Formats

| Format | Best For | Contents |
|--------|----------|----------|
| 📄 JSON | Developers, APIs, databases | Full structured data with metadata |
| 🌐 HTML | Web publishing, sharing | Beautifully styled report with CSS |
| 📑 PDF | Printing, presentations | Clean, professional document |

### Rich Research Data

```json
{
  "topic": "artificial intelligence ethics",
  "sources": {
    "wikipedia": 5,
    "academic": 10,
    "news": 5,
    "general": 10
  },
  "processed_content": {
    "summary": "Comprehensive executive summary...",
    "key_findings": ["Finding 1", "Finding 2", "..."],
    "themes": ["Theme 1", "Theme 2", "..."],
    "entities": ["Entity 1", "Entity 2", "..."]
  }
}
```

---

## 🔧 Configuration Options

```json
{
  "topic": "your research topic here",
  "outputFormats": ["json", "html", "pdf"],
  "maxSourcesPerType": 10,
  "includeWikipedia": true,
  "includeAcademic": true,
  "includeNews": true,
  "includeGeneral": true,
  "searchProviders": ["duckduckgo"],
  "requestTimeout": 30,
  "maxRetries": 3,
  "debug": false
}
```

### Parameter Reference

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `topic` | string | required | 🎯 Your research topic or question |
| `outputFormats` | array | `["json", "html", "pdf"]` | 📄 Output formats to generate |
| `maxSourcesPerType` | integer | `10` | 📊 Sources per category (1-20) |
| `includeWikipedia` | boolean | `true` | 📖 Include Wikipedia articles |
| `includeAcademic` | boolean | `true` | 🎓 Include academic papers |
| `includeNews` | boolean | `true` | 📰 Include news articles |
| `includeGeneral` | boolean | `true` | 🌐 Include general web content |
| `searchProviders` | array | `["duckduckgo"]` | 🔍 Search engines to use |
| `requestTimeout` | integer | `30` | ⏱️ Request timeout (seconds) |
| `maxRetries` | integer | `3` | 🔄 Retry attempts on failure |
| `proxyConfiguration` | object | `null` | 🛡️ Apify proxy settings |
| `debug` | boolean | `false` | 🐛 Enable verbose logging |

---

## 🌐 Data Sources

ResearchGPT taps into 6+ authoritative sources:

| Source | Type | What You Get |
|--------|------|--------------|
| 🌍 Wikipedia | Knowledge Base | Foundational articles via MediaWiki API |
| 📚 arXiv | Academic | Pre-print papers in physics, CS, math, and more |
| 🔬 Semantic Scholar | Academic | 200M+ papers with citation analysis |
| 📖 OpenAlex | Academic | Open catalog of scholarly works |
| 📑 CrossRef | Academic | DOI metadata and citations |
| 📰 News Sources | Current Events | Latest articles via smart extraction |
| 🌐 General Web | Insights | Curated web content with readability algorithms |

---

## 🛡️ Production-Grade Features

Built for reliability and scale:

- ⚡ Smart Caching - 5-minute TTL prevents redundant requests
- 🔄 Retry Logic - Exponential backoff with jitter
- 🚦 Rate Limiting - Respects API limits automatically
- 🔍 Deduplication - MD5-based content fingerprinting
- 🌐 Connection Pooling - Efficient HTTP management
- 🛡️ Error Handling - Graceful fallbacks, never crashes

---

## 💻 Local Development

```bash
# Clone the repository
git clone https://github.com/your-repo/researchgpt-deep-research-agent
cd researchgpt-deep-research-agent

# Create virtual environment
python -m venv .venv
.venv\Scripts\activate      # Windows
source .venv/bin/activate   # macOS/Linux

# Install dependencies
pip install -r requirements.txt

# Run locally
python run_local.py
```

---

## 📁 Project Structure

```
researchgpt-deep-research-agent/
├── 📂 .actor/
│   └── actor.json              # Apify configuration
├── 📂 src/
│   ├── __init__.py
│   └── main.py                 # 🚀 Main Apify entry point
├── 📂 scrapers/
│   ├── base_scraper.py         # Base class with retry/caching
│   ├── academic_scraper.py     # arXiv, Semantic Scholar, etc.
│   ├── wikipedia_scraper.py    # MediaWiki API
│   ├── news_scraper.py         # News extraction
│   ├── general_scraper.py      # Web scraping
│   └── search_engine.py        # Multi-provider search
├── 📂 processors/
│   └── content_processor.py    # NLP processing
├── 📂 output/
│   └── output_generator.py     # Report generation
├── 📄 Dockerfile               # Container definition
├── 📄 requirements.txt         # Dependencies
└── 📄 README.md                # You are here!
```

---

## 🚀 Deploy to Apify

### Option 1: Apify CLI (Recommended)

```bash
npm install -g apify-cli
apify login
apify push
```

### Option 2: GitHub Integration

1. Push to GitHub
2. Apify Console → Create Actor → Link to GitHub
3. Auto-builds on every push! 🔄

---

## 📈 Performance Tips

| Tip | Impact |
|-----|--------|
| Lower `maxSourcesPerType` | ⚡ Faster results |
| Disable unused sources | 🚀 Skip what you don't need |
| Use single search provider | 📉 Reduce API calls |
| Enable debug mode | 🔍 Troubleshoot issues |

---

## 🤔 FAQ

### How long does it take?

Typically 1-3 minutes depending on topic complexity and number of sources.

### Do I need API keys?

No! ResearchGPT uses public APIs and smart scraping. No setup required.

### Can I use this for commercial projects?

Yes! MIT licensed. Use it however you want.

### What if a source is blocked?

ResearchGPT gracefully handles blocks and continues with other sources.

### Can I customize the output?

Yes! Choose your formats (JSON/HTML/PDF) and configure source types.

---

## 🆚 Why ResearchGPT?

| Feature | ResearchGPT | Manual Research | Other Tools |
|---------|-------------|-----------------|-------------|
| ⏱️ Time | 3 minutes | 3+ hours | 30+ minutes |
| 📚 Sources | 6+ databases | Limited | Usually 1-2 |
| 📄 Output | JSON + HTML + PDF | Manual formatting | Single format |
| 💰 Cost | Pay per run | Your time = $$$$ | Subscription |
| 🔧 Setup | Zero | N/A | API keys needed |

---

## 📝 Example Topics

Get inspired! Here are some topics that work great:

- "artificial intelligence ethics and regulation 2025"
- "quantum computing practical applications"
- "climate change solutions renewable energy"
- "cryptocurrency DeFi market analysis"
- "remote work productivity research"
- "mental health digital therapeutics"
- "gene editing CRISPR medical applications"
- "electric vehicles battery technology"

---

## 🤝 Support & Community

- 🐛 Issues: Report bugs
- 💡 Feature Requests: Suggest ideas
- 📖 Docs: Apify Documentation
- 💬 Discord: Join Apify Community

---

## 📄 License

MIT License - Use it freely, commercially or personally.

---

### ⭐ Star this repo if ResearchGPT saved you time!

Built with ❤️ for researchers everywhere

Get Started • Documentation • Support
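Two of the production features listed above, exponential backoff with jitter and MD5-based deduplication, can be sketched in a few lines. This is an illustration under stated assumptions, not the actor's actual code: the function names, the 1-second base delay, the 30-second cap, the "full jitter" variant, and the whitespace and case normalization applied before hashing are all choices made for this example.

```python
import hashlib
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random):
    """Return a sleep time in seconds for a 0-based retry attempt.

    Full jitter (an assumed variant): draw uniformly from
    [0, min(cap, base * 2**attempt)] so concurrent clients spread out.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


def fingerprint(text):
    """MD5 hex digest of the text after collapsing whitespace and lowercasing.

    MD5 serves here only as a fast content fingerprint, not for security.
    """
    normalized = " ".join(text.split()).lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def deduplicate(documents):
    """Keep the first document for each distinct fingerprint, in order."""
    seen = set()
    unique = []
    for doc in documents:
        key = fingerprint(doc)
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


if __name__ == "__main__":
    docs = [
        "Quantum computing  basics",
        "quantum computing basics",
        "CRISPR overview",
    ]
    print(deduplicate(docs))
    print([round(backoff_delay(n), 2) for n in range(4)])
```

With these assumptions, the upper bound on the delay doubles with each attempt until it reaches the cap, and documents that differ only in spacing or letter case collapse to a single entry.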
Categories
Common Use Cases
- Market Research: Gather competitive intelligence and market data
- Lead Generation: Extract contact information for sales outreach
- Price Monitoring: Track competitor pricing and product changes
- Content Aggregation: Collect and organize content from multiple sources
Ready to Get Started?
Try Researchgpt Deep Research Agent now on Apify. Free tier available with no credit card required.
Actor Information
- Developer: wheat_tourist
- Pricing: Paid
- Total Runs: 12
- Active Users: 2
Related Actors
Google Search Results Scraper
by apify
Website Content Crawler
by apify
🔥 Leads Generator - $3/1k 50k leads like Apollo
by microworlds
Video Transcript Scraper: Youtube, X, Facebook, Tiktok, etc.
by invideoiq