· 6 years ago · Dec 01, 2019, 07:24 PM
From: <Saved by Blink>
Snapshot-Content-Location: https://pastebin.com/doc_scraping_api
Subject: Pastebin.com - Scraping API
Date: Mon, 7 Oct 2019 19:18:41 -0000
MIME-Version: 1.0
Content-Type: multipart/related; type="text/html"; boundary="----MultipartBoundary--lqaWwnDhy388ydh4RcOJ2o7Ko3s9fk9NrLVaZF9wOA----"

------MultipartBoundary--lqaWwnDhy388ydh4RcOJ2o7Ko3s9fk9NrLVaZF9wOA----
Content-Type: text/html
Content-ID: <frame-1AAB9E34C257E1D57FB5EC4A7FE85F4F@mhtml.blink>
Content-Transfer-Encoding: binary
Content-Location: https://pastebin.com/doc_scraping_api

<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!-- Global site tag (gtag.js) - Google Analytics -->
<title>Pastebin.com - Scraping API</title>
<link rel="shortcut icon" href="https://pastebin.com/favicon.ico">
<link href="https://pastebin.com/i/pastebin.min.v9.css" rel="stylesheet" type="text/css">
<!--[if lt IE 10]> <link href="/i/pastebin.ie8.css" rel="stylesheet" type="text/css" /> <![endif]-->
<style>body{-webkit-text-size-adjust:none;}</style>
<meta property="fb:app_id" content="231493360234820">
<meta property="og:title" content="Pastebin.com - Scraping API">
<meta property="og:type" content="article">
<meta property="og:url" content="https://pastebin.com/doc_scraping_api">
<meta property="og:image" content="https://pastebin.com/i/facebook.png">
<meta property="og:site_name" content="Pastebin">
<meta name="google-site-verification" content="jkUAIOE8owUXu8UXIhRLB9oHJsWBfOgJbZzncqHoF4A">
<link rel="canonical" href="https://pastebin.com/doc_scraping_api">
<meta name="viewport" content="width=device-width, initial-scale=0.75, maximum-scale=1.0, user-scalable=yes">
<link rel="preload" href="https://adservice.google.com.et/adsid/integrator.js?domain=pastebin.com" as="script"><link rel="preload" href="https://adservice.google.com/adsid/integrator.js?domain=pastebin.com" as="script"><link rel="prefetch" 
href="https://tpc.googlesyndication.com/safeframe/1-0-35/html/container.html"></head>
<body> <div id="main_frame">
<div id="super_frame"> <div id="monster_frame"> <div id="content_frame">
<div id="content_left"> <div class="layout_clear"></div>
<div class="content_title">Scraping API</div>
<div class="content_text" style="padding: 10px 0 0 0"> <div id="notice">You can scrape our website, but your IP will most likely get blocked to prevent unnecessary load on our servers. We therefore offer this scraping API service for people who want to scrape our platform without getting blocked.</div> </div>
<div class="content_text"> This is the Pastebin scraping API documentation page. Here you can find all the information you need to get started with our scraping API. If you have questions, feel free to <a href="https://pastebin.com/contact">contact us</a>. <div style="margin: 8px 0 0 30px"> 1. <a href="https://pastebin.com/doc_scraping_api#1">Your Whitelisted IP</a><br> 2. <a href="https://pastebin.com/doc_scraping_api#2">Request Limits</a><br> 3. <a href="https://pastebin.com/doc_scraping_api#3">Recommended Scraping Logic</a><br> 4. <a href="https://pastebin.com/doc_scraping_api#4">Request Most Recent Pastes</a><br> 5. <a href="https://pastebin.com/doc_scraping_api#5">Request RAW Paste Data</a><br> 6. 
<a href="https://pastebin.com/doc_scraping_api#6">Request Paste Metadata</a> </div> </div>
<a name="1"></a> <div class="content_sub_title">Your Account & Whitelisted IP</div>
<div class="content_text"> Our scraping API is only available to LIFETIME PRO members, and only to those who have whitelisted their IP.<br><br> <div id="error"><b>IMPORTANT:</b> TO WHITELIST YOUR IP, YOU NEED TO BE A LIFETIME PRO MEMBER! <a href="https://pastebin.com/pro">UPGRADE TO A LIFETIME PRO ACCOUNT</a>, THEN YOU CAN WHITELIST YOUR IP ON THIS PAGE!</div> <br> Only one IP can be whitelisted per Pastebin PRO account. You can update your IP as often as you like, and changes take effect immediately. Both IPv4 and IPv6 addresses are accepted. Which one you have to whitelist depends on your network configuration.<br><br> <b>Important:</b> Please make sure you ONLY fetch the scraping API endpoints listed on this page. If you scrape our website (including /raw/* pages) with your whitelisted IP, you will get blocked. </div>
<a name="2"></a> <div class="content_sub_title">Request Limits</div>
<div class="content_text"> Your whitelisted IP should not run into any issues as long as you don't abuse our service. We recommend making no more than 1 request per second, as there really is no need for more. Going over 1 request per second won't get you blocked, but if we see excessive unnecessary scraping, we might take action. </div>
<a name="3"></a> <div class="content_sub_title">Recommended Scraping Logic</div>
<div class="content_text"> If you are trying to read ALL new public pastes, we recommend listing the 100 most recent pastes once per minute. Store all those IDs/keys locally, then fetch those pastes and process the information however you like. 
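<br><br> As a rough illustration of the logic above, the Python sketch below keeps a local set of paste keys and only hands unseen pastes to a handler. It is a minimal sketch, not an official client: <span class="mark">select_new</span>, <span class="mark">poll_once</span>, and the <span class="mark">fetch_listing</span> and <span class="mark">handle_paste</span> callables are hypothetical names introduced here for illustration. The only assumption about the listing is that each entry carries a <span class="mark">key</span> field, as in the example output in section 4 of this page.

```python
import time


def select_new(listing, seen):
    """Return the listing entries whose "key" is not in `seen`, and record them.

    `listing` is an iterable of dicts shaped like the example output on this
    page (only the "key" field is used). `seen` is a set of known paste keys.
    """
    fresh = []
    for item in listing:
        key = item["key"]
        if key not in seen:
            seen.add(key)
            fresh.append(item)
    return fresh


def poll_once(fetch_listing, handle_paste, seen, pause=1.0):
    """Run one polling cycle and return the number of new pastes handled.

    `fetch_listing` and `handle_paste` are hypothetical callables supplied by
    the caller: the first returns the decoded listing, the second fetches and
    processes a single paste. `pause` spaces out the per-paste work so the
    caller stays at or below the recommended 1 request per second.
    """
    fresh = select_new(fetch_listing(), seen)
    for item in fresh:
        handle_paste(item)
        time.sleep(pause)
    return len(fresh)
```

Calling <span class="mark">poll_once</span> once per minute with the same set matches the recommendation above. An in-memory set is lost on restart, so a long-running scraper would probably want to persist the keys in a small database instead.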
<br><br> We urge you not to re-fetch pastes unnecessarily, as the data doesn't change quickly. A local database that prevents unnecessary re-fetches is highly recommended! It lowers the load on both your servers and ours. </div>
<a name="4"></a> <div class="content_sub_title">Request Most Recent Pastes</div>
<div class="content_text"> <div id="notice">Due to caching, items listed here are shown with a 2-minute delay.</div> To fetch the most recent pastes, query the link below. The output is standard JSON. You can limit the number of results by adding, for example, <span class="mark">?limit=10</span>. The maximum allowed value is <span class="mark">250</span>; the default is <span class="mark">50</span>. You can only reach this link from your whitelisted IP! <div class="code_box boostit">https://scrape.pastebin.com/api_scraping.php</div> Below is an example output: <div class="code_box" style="overflow:scroll">[
  {
    "scrape_url": "https://scrape.pastebin.com/api_scrape_item.php?i=0CeaNm8Y",
    "full_url": "https://pastebin.com/0CeaNm8Y",
    "date": "1442911802",
    "key": "0CeaNm8Y",
    "size": "890",
    "expire": "1442998159",
    "title": "Once we all know when we goto function",
    "syntax": "java",
    "user": "admin"
  },
  {
    "scrape_url": "https://scrape.pastebin.com/api_scrape_item.php?i=8sUIsf34",
    "full_url": "https://pastebin.com/8sUIsf34",
    "date": "1442911665",
    "key": "8sUIsf34",
    "size": "250",
    "expire": "0",
    "title": "master / development delete restriction",
    "syntax": "php",
    "user": ""
  }
] </div> You can also add, for example, <span class="mark">?lang=php</span> if you only want results in a certain language. We support well over 200 languages. <a href="https://pastebin.com/api#5">Click here</a> to find all the supported languages. Always use the value on the left-hand side of that list in your query. </div>
<a name="5"></a> <div class="content_sub_title">Request RAW Paste Data</div>
<div class="content_text"> To fetch the RAW data of any paste, you can