Instagram User Posts Dataset (Full History) cookieless

by patient_discovery

1 run
1 user
Try This Actor

Opens on Apify.com

About Instagram User Posts Dataset (Full History) cookieless

Extract high-fidelity Instagram post metadata with granular engagement metrics. Captures hidden fields, comprehensive timestamps, and precise interaction data for advanced social media performance analysis and strategic content optimization.

What does this actor do?

Instagram User Posts Dataset (Full History) cookieless is a web scraping actor on the Apify platform. Given an Instagram username, it extracts that profile's post history, including post metadata and engagement metrics, and runs in the cloud.

Key Features

  • Cloud-based execution - no local setup required
  • Scalable infrastructure for large-scale operations
  • API access for integration with your applications
  • Built-in proxy rotation and anti-blocking measures
  • Scheduled runs and webhooks for automation

How to Use

  1. Click "Try This Actor" to open it on Apify
  2. Create a free Apify account if you don't have one
  3. Configure the input parameters as needed
  4. Run the actor and download your results
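
Step 3 can be sketched in Python. The Documentation section lists a single required input field, `username`, with `Ronaldo` as its example value. The helper name `build_actor_input` and the whitespace trimming are illustrative assumptions, not part of the actor; the resulting JSON is what you would paste into the actor's input form or send through the Apify API.

```python
import json


def build_actor_input(username: str) -> str:
    """Serialize the one documented input field as a JSON string.

    Hypothetical helper: the actor itself only documents the `username` field.
    """
    cleaned = username.strip()  # assumption: surrounding whitespace is never wanted
    if not cleaned:
        raise ValueError("username is required")
    return json.dumps({"username": cleaned})


print(build_actor_input("Ronaldo"))
```

The documentation says the field also accepts a user ID or a profile URL, so the same helper can be called with either of those strings.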

Documentation

Instagram User Posts Dataset (Full History)

## Overview

This actor performs a deep extraction of Instagram post data, capturing comprehensive engagement metrics, content metadata, and audience demographics from user profiles. The extraction pipeline ensures data integrity through structured validation and timestamp verification, delivering a complete historical record of post performance suitable for quantitative analysis and machine learning applications.

## Data Dictionary

| Field Name | Data Type | Definition |
|-----------|-----------|------------|
| post_id | String | Unique Instagram post identifier assigned by the platform |
| external_id | String | Internal tracking identifier for cross-reference and deduplication |
| username | String | Instagram handle of the account that published the post |
| scraped_at | String (ISO 8601) | UTC timestamp indicating when the data extraction occurred |
| post_type | String | Content format classification (e.g., carousel, image, video, reel) |
| caption | String | User-generated text content accompanying the post |
| language_code | String | ISO 639-1 two-letter language code detected in caption |
| is_verified | Boolean | Indicates whether the account has Instagram verification status (blue check) |
| media_count | Integer | Number of media items included in the post (relevant for carousels) |
| engagement_metrics | Object | Nested object containing quantitative interaction data |
| engagement_metrics.likes | Integer | Total number of likes received on the post |
| engagement_metrics.comments | Integer | Total number of comments on the post |
| engagement_metrics.shares | Integer | Number of times the post was shared via direct message or story |
| engagement_metrics.saves | Integer | Number of users who bookmarked the post |
| engagement_metrics.reach | Integer | Estimated unique accounts that viewed the post |
| location | Object | Geographic metadata associated with the post |
| location.id | String | Instagram location identifier |
| location.name | String | Human-readable location name |
| location.lat | Float | Latitude coordinate in decimal degrees |
| location.lng | Float | Longitude coordinate in decimal degrees |
| hashtags | Array[String] | List of hashtags extracted from caption (without # symbol) |
| mentions | Array[String] | List of @-mentioned usernames in caption |
| sentiment_score | Float | Computed sentiment polarity score ranging from -1.0 (negative) to 1.0 (positive) |
| accessibility_caption | String | Auto-generated or user-provided alt text describing visual content |
| is_sponsored | Boolean | Indicates whether the post is marked as paid partnership content |
| content_category | String | Classified content vertical (e.g., Technology, Fashion, Food) |
| engagement_rate | Float | Calculated percentage: (likes + comments) / reach × 100 |
| audience_demographics | Object | Aggregated viewer demographic data |
| audience_demographics.age_range | String | Dominant age bracket of engaged users |
| audience_demographics.top_locations | Array[String] | ISO 3166-1 alpha-2 country codes of primary audience |
| audience_demographics.gender_split | Object | Percentage distribution of audience by gender |
| audience_demographics.gender_split.male | Float | Percentage of male-identified viewers |
| audience_demographics.gender_split.female | Float | Percentage of female-identified viewers |

## Sample Dataset

Below is a sample of the high-fidelity JSON output:

    {
      "post_id": "987654321098765",
      "external_id": "IGP_2025121945X8B7C",
      "username": "tech_explorer",
      "scraped_at": "2025-12-19T15:30:22Z",
      "post_type": "carousel",
      "caption": "Exploring the future of AI #techinnovation",
      "language_code": "en",
      "is_verified": true,
      "media_count": 3,
      "engagement_metrics": {
        "likes": 7823,
        "comments": 342,
        "shares": 156,
        "saves": 891,
        "reach": 45678
      },
      "location": {
        "id": "567891234",
        "name": "Silicon Valley",
        "lat": 37.4419,
        "lng": -122.1419
      },
      "hashtags": ["techinnovation", "ai", "future"],
      "mentions": ["@techweekly", "@airesearch"],
      "sentiment_score": 0.87,
      "accessibility_caption": "Person demonstrating new technology in lab setting",
      "is_sponsored": false,
      "content_category": "Technology",
      "engagement_rate": 4.52,
      "audience_demographics": {
        "age_range": "25-34",
        "top_locations": ["US", "UK", "IN"],
        "gender_split": {
          "male": 65.4,
          "female": 34.6
        }
      }
    }

## Configuration Parameters

To ensure optimal data depth, configure the following:

| Parameter | Field Name | Data Type | Required | Example | Description |
|-----------|-----------|-----------|----------|---------|-------------|
| Username | username | String | Yes | Ronaldo | Instagram username, user ID, or profile URL to extract post history from |

## Analytical Use Cases

Researchers and data scientists can leverage this dataset for:

- Sentiment Analysis: Correlate sentiment scores with engagement metrics to identify content tone preferences across audience segments
- Content Performance Modeling: Build predictive models using post_type, hashtags, and timing variables to forecast engagement rates
- Audience Segmentation: Cluster posts by audience_demographics to identify distinct follower cohorts and their content preferences
- Longitudinal Studies: Track engagement_rate trends over time using scraped_at timestamps to measure account growth trajectories
- Network Mapping: Construct mention graphs from the mentions array to visualize influencer collaboration networks
- Geographic Analysis: Map location data to identify regional content performance patterns and optimize posting strategies by geography
- A/B Testing: Compare is_sponsored posts against organic content to quantify paid promotion effectiveness

## Technical Limitations

Important Considerations:

- Rate Limiting: Instagram's API enforces request throttling; bulk extractions may require distributed execution across multiple IP addresses
- Data Freshness: Engagement metrics reflect point-in-time snapshots at scraped_at; real-time updates require continuous polling
- Private Accounts: Extraction is limited to public profiles; private accounts return null datasets unless authenticated access is granted
- Historical Depth: Post history retrieval is bounded by Instagram's pagination limits (typically 2,000-5,000 posts per profile)
- Demographic Accuracy: audience_demographics represents aggregated estimates and may not reflect individual post viewer composition
- Deleted Content: Posts removed after initial scraping will not appear in subsequent extractions; maintain versioned datasets for completeness
- Verification Status: is_verified reflects account status at extraction time and may change without historical tracking

---

Keywords & Tags: This dataset supports workflows involving instagram scraper, instagram data extractor, instagram user posts, extract instagram posts, export instagram posts, instagram lead generation, and social media scraping tool applications for comprehensive social media analytics.
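
The `engagement_rate` definition and the `scraped_at` format from the Data Dictionary can be checked against a record with a few lines of Python. This is a minimal sketch using only the standard library. The record is an abbreviated copy of the sample above, and the function names `recompute_engagement_rate` and `parse_scraped_at` are hypothetical helpers, not part of the actor's output or API.

```python
from datetime import datetime

# Abbreviated copy of the sample record shown in the documentation.
sample_post = {
    "post_id": "987654321098765",
    "scraped_at": "2025-12-19T15:30:22Z",
    "engagement_metrics": {
        "likes": 7823,
        "comments": 342,
        "shares": 156,
        "saves": 891,
        "reach": 45678,
    },
    "engagement_rate": 4.52,
}


def recompute_engagement_rate(post):
    """Apply the documented formula: (likes + comments) / reach * 100."""
    metrics = post["engagement_metrics"]
    reach = metrics["reach"]
    if not reach:
        return None  # avoid dividing by zero when reach is 0 or missing a value
    return (metrics["likes"] + metrics["comments"]) / reach * 100


def parse_scraped_at(post):
    """Parse the ISO 8601 UTC timestamp into a timezone-aware datetime."""
    # Older Python versions do not accept a trailing "Z", so map it to +00:00.
    return datetime.fromisoformat(post["scraped_at"].replace("Z", "+00:00"))


rate = recompute_engagement_rate(sample_post)
scraped = parse_scraped_at(sample_post)
print(round(rate, 2), scraped.isoformat())
```

Applied to the sample values, the documented formula gives roughly 17.88, not the 4.52 stored in the sample's `engagement_rate` field. The source does not explain the difference, so it may be safer to recompute the rate from `engagement_metrics` than to rely on the stored value when comparing posts.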

Common Use Cases

Market Research

Gather competitive intelligence and market data

Lead Generation

Extract contact information for sales outreach

Price Monitoring

Track competitor pricing and product changes

Content Aggregation

Collect and organize content from multiple sources

Ready to Get Started?

Try Instagram User Posts Dataset (Full History) cookieless now on Apify. Free tier available with no credit card required.

Start Free Trial

Actor Information

Developer: patient_discovery
Pricing: Paid
Total Runs: 1
Active Users: 1

Apify Platform

Apify provides a cloud platform for web scraping, data extraction, and automation. Build and run web scrapers in the cloud.

Learn more about Apify
