1From: <Saved by Blink>
2Snapshot-Content-Location: https://pastebin.com/doc_scraping_api
3Subject: Pastebin.com - Scraping API
4Date: Mon, 7 Oct 2019 19:18:41 -0000
5MIME-Version: 1.0
6Content-Type: multipart/related;
7 type="text/html";
8 boundary="----MultipartBoundary--lqaWwnDhy388ydh4RcOJ2o7Ko3s9fk9NrLVaZF9wOA----"
9
10
11------MultipartBoundary--lqaWwnDhy388ydh4RcOJ2o7Ko3s9fk9NrLVaZF9wOA----
12Content-Type: text/html
13Content-ID: <frame-1AAB9E34C257E1D57FB5EC4A7FE85F4F@mhtml.blink>
14Content-Transfer-Encoding: binary
15Content-Location: https://pastebin.com/doc_scraping_api
16
17<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
18 <!-- Global site tag (gtag.js) - Google Analytics -->
19
20
21
22
23
24 <title>Pastebin.com - Scraping API</title>
25 <link rel="shortcut icon" href="https://pastebin.com/favicon.ico">
26
27
28
29
30
31 <link href="https://pastebin.com/i/pastebin.min.v9.css" rel="stylesheet" type="text/css">
32 <!--[if lt IE 10]>
33 <link href="/i/pastebin.ie8.css" rel="stylesheet" type="text/css" />
34 <![endif]-->
35
36
37 <style>body{-webkit-text-size-adjust:none;}</style>
38 <meta property="fb:app_id" content="231493360234820">
39 <meta property="og:title" content="Pastebin.com - Scraping API">
40 <meta property="og:type" content="article">
41 <meta property="og:url" content="https://pastebin.com/doc_scraping_api">
42 <meta property="og:image" content="https://pastebin.com/i/facebook.png">
43 <meta property="og:site_name" content="Pastebin">
44 <meta name="google-site-verification" content="jkUAIOE8owUXu8UXIhRLB9oHJsWBfOgJbZzncqHoF4A">
45 <link rel="canonical" href="https://pastebin.com/doc_scraping_api">
46 <meta name="viewport" content="width=device-width, initial-scale=0.75, maximum-scale=1.0, user-scalable=yes">
47
 </head>
49 <body>
50 <div id="main_frame">
109 <div id="super_frame">
110 <div id="monster_frame">
111 <div id="content_frame">
130 <div id="content_left"><div id="ie_msg"></div>
131
132
139 <div class="layout_clear"></div>
140 <div class="content_title">Scraping API</div>
141 <div class="content_text" style="padding: 10px 0 0 0">
142 <div id="notice">You can scrape our website, but your IP will most likely get blocked to prevent unnecessary load on our servers. We therefore offer this scraping API service for people who want to scrape our platform without getting blocked.</div>
143 </div>
144 <div class="content_text">
145 This is the Pastebin scraping API documentation page. Here you can find all the information you need to get started with our scraping API. If you have questions, feel free to <a href="https://pastebin.com/contact">contact us</a>.
146
147 <div style="margin: 8px 0 0 30px">
148 1. <a href="https://pastebin.com/doc_scraping_api#1">Your Whitelisted IP</a><br>
149 2. <a href="https://pastebin.com/doc_scraping_api#2">Request Limits</a><br>
150 3. <a href="https://pastebin.com/doc_scraping_api#3">Recommended Scraping Logic</a><br>
151 4. <a href="https://pastebin.com/doc_scraping_api#4">Request Most Recent Pastes</a><br>
152 5. <a href="https://pastebin.com/doc_scraping_api#5">Request RAW Paste Data</a><br>
153 6. <a href="https://pastebin.com/doc_scraping_api#6">Request Paste Metadata</a>
154 </div>
155 </div>
156 <a name="1"></a>
 <div class="content_sub_title">Your Account &amp; Whitelisted IP</div>
158 <div class="content_text">
159 Our scraping API is only available for LIFETIME PRO members, and only for those who have their IP whitelisted.<br><br>
160
161 <div id="error"><b>IMPORTANT:</b> TO WHITELIST YOUR IP, YOU NEED TO BE A LIFETIME PRO MEMBER! <a href="https://pastebin.com/pro">UPGRADE TO A LIFETIME PRO ACCOUNT</a>, THEN YOU CAN WHITELIST YOUR IP ON THIS PAGE!</div>
162
 Only one IP address can be whitelisted per Pastebin PRO account. You can update your IP as often as you like, and changes take effect immediately. Both IPv4 and IPv6 addresses are accepted; which one you need to whitelist depends on your network configuration.<br><br>
167 <b>Important:</b> Please make sure you ONLY fetch the scraping API endpoints listed on this page. If you scrape our website (including /raw/* pages) with your whitelisted IP, you will get blocked.
168 </div>
169 <a name="2"></a>
170 <div class="content_sub_title">Request Limits</div>
171 <div class="content_text">
172 Your whitelisted IP should not run into any issues as long as you don't abuse our service. We recommend not making more than 1 request per second, as there really is no need to do so. Going over 1 request per second won't get you blocked, but if we see excessive unnecessary scraping, we might take action.
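 <br><br>
 One way to stay at or below the recommended 1 request per second is a small client-side throttle. The Python sketch below is illustrative only and is not part of the Pastebin API: the class name MinIntervalThrottle and its parameters are hypothetical, and the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time


class MinIntervalThrottle:
    """Keeps consecutive calls to wait() at least `min_interval` seconds apart.

    Hypothetical helper for illustration; not provided by Pastebin.
    """

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        """Sleep if the previous call was too recent. Returns the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        # Record the time after any sleep so the next interval starts from here.
        self._last = self._clock()
        return delay
```

 Calling wait() immediately before each request to a scraping endpoint spaces the requests out without adding any delay when the scraper is already slower than the limit.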
173 </div>
174 <a name="3"></a>
175 <div class="content_sub_title">Recommended Scraping Logic</div>
176 <div class="content_text">
 If you are trying to read ALL new public pastes, we recommend that you list the 100 most recent pastes once per minute (the default listing size is 50, so request <span class="mark">?limit=100</span>). Store those IDs/keys locally, then fetch each paste and process the information however you like.
 <br><br>
 We urge you not to re-fetch pastes unnecessarily, as the data doesn't change quickly. A local database that records which pastes you have already fetched is highly recommended! It lowers the load on both your servers and ours.
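 <br><br>
 The local bookkeeping recommended above could be sketched as follows, assuming Python's built-in sqlite3 module as the local database. The table name seen and the functions open_store and filter_new are hypothetical names chosen for this example. Each listing entry is assumed to be a dict with a "key" field, as in the listing output documented on this page.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the local store of paste keys that were already seen."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen (key TEXT PRIMARY KEY, first_seen INTEGER)"
    )
    return conn


def filter_new(conn, listing, now):
    """Return only the entries whose key is not in the store yet, and record them.

    `now` is a Unix timestamp supplied by the caller.
    """
    fresh = []
    for item in listing:
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen (key, first_seen) VALUES (?, ?)",
            (item["key"], now),
        )
        # rowcount is 1 when the row was inserted and 0 when the key already existed.
        if cur.rowcount == 1:
            fresh.append(item)
    conn.commit()
    return fresh
```

 This version records a key as soon as it appears in a listing. A scraper that must never miss a paste might instead record the key only after the paste has been fetched successfully.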
180 </div>
181 <a name="4"></a>
182 <div class="content_sub_title">Request Most Recent Pastes</div>
183 <div class="content_text">
184 <div id="notice">Due to caching, items listed here are shown with a 2 minute delay.</div>
 To fetch the most recent pastes, query the link below. The response is a JSON array with one object per paste. You can limit the number of results by adding a parameter such as <span class="mark">?limit=10</span>. The maximum allowed value is <span class="mark">250</span> and the default is <span class="mark">50</span>. You can only reach this link from your whitelisted IP!
186 <div class="code_box boostit">https://scrape.pastebin.com/api_scraping.php</div>
187
188 Below is an example output:
 <div class="code_box" style="overflow:scroll">[
    {
        "scrape_url": "https://scrape.pastebin.com/api_scrape_item.php?i=0CeaNm8Y",
        "full_url": "https://pastebin.com/0CeaNm8Y",
        "date": "1442911802",
        "key": "0CeaNm8Y",
        "size": "890",
        "expire": "1442998159",
        "title": "Once we all know when we goto function",
        "syntax": "java",
        "user": "admin"
    },
    {
        "scrape_url": "https://scrape.pastebin.com/api_scrape_item.php?i=8sUIsf34",
        "full_url": "https://pastebin.com/8sUIsf34",
        "date": "1442911665",
        "key": "8sUIsf34",
        "size": "250",
        "expire": "0",
        "title": "master / development delete restriction",
        "syntax": "php",
        "user": ""
    }
] </div>
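 In the example output every value, including the numeric ones, arrives as a JSON string. A client could convert them while parsing, as in the Python sketch below. The names PasteEntry and parse_listing are hypothetical, and the field set is taken only from the example above, so real responses may contain other fields.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class PasteEntry:
    key: str
    scrape_url: str
    full_url: str
    date: int    # Unix timestamp, a string in the raw JSON
    size: int    # a string in the raw JSON
    expire: int  # Unix timestamp; the example also shows "0", whose meaning is not stated here
    title: str
    syntax: str
    user: str    # may be an empty string, as in the second example entry


def parse_listing(body: str) -> List[PasteEntry]:
    """Parse the JSON text of a listing response into typed entries."""
    entries = []
    for raw in json.loads(body):
        entries.append(
            PasteEntry(
                key=raw["key"],
                scrape_url=raw["scrape_url"],
                full_url=raw["full_url"],
                date=int(raw["date"]),
                size=int(raw["size"]),
                expire=int(raw["expire"]),
                title=raw["title"],
                syntax=raw["syntax"],
                user=raw["user"],
            )
        )
    return entries
```
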
 To restrict results to a single language, add a parameter such as <span class="mark">?lang=php</span>. We support well over 200 languages; <a href="https://pastebin.com/api#5">click here</a> for the full list. Always use the value shown on the left-hand side of that list as the parameter value.
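 <br><br>
 The limit and lang parameters can be assembled with a small helper such as the Python sketch below. The function name listing_url is hypothetical. The sketch also makes two assumptions that this page does not confirm: that both parameters may be combined in one query string, and that 1 is the smallest useful limit.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://scrape.pastebin.com/api_scraping.php"
MAX_LIMIT = 250  # maximum allowed value documented above


def listing_url(limit: Optional[int] = None, lang: Optional[str] = None) -> str:
    """Build the listing URL, omitting any parameter that was not given."""
    params = {}
    if limit is not None:
        # The lower bound of 1 is an assumption; only the maximum is documented.
        if not 1 <= limit <= MAX_LIMIT:
            raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
        params["limit"] = limit
    if lang is not None:
        params["lang"] = lang
    if not params:
        return BASE_URL
    return BASE_URL + "?" + urlencode(params)
```

 For example, listing_url(limit=100) produces the URL for the once-per-minute listing suggested under Recommended Scraping Logic.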
214 </div>
215 <a name="5"></a>
216 <div class="content_sub_title">Request RAW Paste Data</div>
217 <div class="content_text">
218 To fetch the RAW data of any paste, you can