
10/12/2017 Get webpage contents with Python? - Stack Overflow


Get webpage contents with Python?

I'm using Python 3.1, if that helps.

Anyways, I'm trying to get the contents of this webpage. I Googled for a little bit and tried different things, but they didn't work. I'm guessing that
this should be an easy task, but...I can't get it. :/.

Results of urllib, urllib2:

>>> import urllib2
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import urllib2
ImportError: No module named urllib2
>>> import urllib
>>> urllib.urlopen("http://www.python.org")
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    urllib.urlopen("http://www.python.org")
AttributeError: 'module' object has no attribute 'urlopen'
>>>

Python 3 solution

Thank you, Jason. :D.

import urllib.request
page = urllib.request.urlopen('http://hiscore.runescape.com/index_lite.ws?player=zezima')
print(page.read())
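A note on the snippet above: in Python 3, `page.read()` returns `bytes`, so `print` shows a `b'...'` literal rather than plain text. Here is a minimal sketch of decoding the body, assuming a reasonably current Python 3. The helper names `decode_body` and `fetch_text`, the 10-second timeout, and the UTF-8 fallback are illustrative assumptions, not part of the original post.

```python
import urllib.request


def decode_body(raw, charset=None, fallback="utf-8"):
    """Decode a response body to str, using the declared charset if any.

    `fallback` is an assumed default for servers that declare no charset.
    """
    return raw.decode(charset or fallback, errors="replace")


def fetch_text(url, timeout=10):
    """Fetch `url` and return its body as text (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as page:
        # get_content_charset() reads the charset from the Content-Type header
        charset = page.headers.get_content_charset()
        return decode_body(page.read(), charset)
```

With that in place, `print(fetch_text('http://hiscore.runescape.com/index_lite.ws?player=zezima'))` should print text instead of a bytes literal.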

python python-3.x

asked Dec 3 '09 at 22:25 by Andrew; edited Mar 14 '13 at 22:38 by idbrii

Duplicate: Search for urllib2 or "get web page [python]" on SO and you'll find hundreds of similar questions. - S.Lott, Dec 3 '09 at 22:26

Tried urllib2 and urllib, but neither worked. (Edited first post) - Andrew, Dec 3 '09 at 22:32

He's using Python 3, so the APIs are different. I surely learned something new by researching this answer. - Jason R. Coombs, Dec 3 '09 at 22:39

@Andrew: It helps to check the questions and answers carefully to see if they say Python 3 or not. If they don't say Python 3, they don't apply to you. - S.Lott, Dec 3 '09 at 22:40

For anyone looking for Python 2, see stackoverflow.com/q/2289768/79125 (use urllib.urlopen) - idbrii, Mar 14 '13 at 22:40

6 Answers

Because you're using Python 3.1, you need to use the new Python 3 APIs: Python 2's urllib2 was split into urllib.request and urllib.error.

Try:

import urllib.request
urllib.request.urlopen('http://www.python.org/')
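Since `urlopen` raises an exception when a request fails, it can help to catch the two exception types that `urllib.error` defines. The following is only a sketch: the function name `try_fetch`, the return convention, and the timeout value are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def try_fetch(url, timeout=10):
    """Return (body_bytes, None) on success or (None, error_text) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read(), None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return None, "HTTP error {}: {}".format(exc.code, exc.reason)
    except urllib.error.URLError as exc:
        # No usable answer: DNS failure, refused connection, unknown scheme...
        return None, "URL error: {}".format(exc.reason)
```

A call such as `body, err = try_fetch('http://www.python.org/')` then lets the caller check `err` before using `body`.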

Alternatively, since it looks like you're working from Python 2 examples, write it in Python 2, then use
the 2to3 tool to convert it. On Windows, 2to3.py is in \python31\tools\scripts. Can someone
else point out where to find 2to3.py on other platforms?

https://stackoverflow.com/questions/1843422/get-webpage-contents-with-python 1/3
Edit

These days, I write Python 2 and 3 compatible code by using six.

from six.moves import urllib
urllib.request.urlopen('http://www.python.org')

Assuming you have six installed, that runs on both Python 2 and Python 3.

answered Dec 3 '09 at 22:38 by Jason R. Coombs; edited Apr 1 '15 at 20:20

I'm on Windows. Anyways, thanks, it worked fine. (The page you linked me to looks very helpful, by the way. Thanks for that, especially.) - Andrew, Dec 3 '09 at 22:42

On Ubuntu, it was in the path, so I just had to run the 2to3 command. whereis says it is at /usr/bin/2to3 - Azendale, Dec 15 '12 at 18:29

Damn, Python 3 is starting to become a problem: one can't just copy-paste the first Stack Overflow answer and expect it to work anymore! - xApple, Feb 1 '13 at 15:38

@xApple: The way I see it, Python 2 is starting to become a problem ;) - Jason R. Coombs, Apr 1 '15 at 20:20

The best way to do this these days is to use the third-party 'requests' library:

import requests
response = requests.get('http://hiscore.runescape.com/index_lite.ws?player=zezima')
print(response.status_code)
print(response.content)

answered May 9 '14 at 13:02 by Jonathan Hartley

Zezima foreva <3 - White Shadow, Nov 7 '16 at 6:29

If you ask me, try this one (Python 2 only, since urllib2 does not exist in Python 3):

import urllib2
resp = urllib2.urlopen('http://hiscore.runescape.com/index_lite.ws?player=zezima')

and read it the normal way, i.e.

page = resp.read()

Good luck though

answered Nov 14 '13 at 9:02 by Olu; edited Dec 12 '15 at 5:37 by Sumit

Mechanize is a great package for "acting like a browser", if you want to handle cookie
state, etc.

http://wwwsearch.sourceforge.net/mechanize/
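If cookie state is all that is needed, the Python 3 standard library can keep cookies between requests without mechanize. The sketch below is not mechanize and does none of its form or link handling; `make_cookie_opener` is a hypothetical helper name used for illustration.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Build an opener that remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor stores cookies from responses in `jar`
    # and sends matching ones back on later requests.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

After `opener, jar = make_cookie_opener()`, requests made with `opener.open(url)` share the same jar, and iterating over `jar` lists whatever cookies the server set.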

answered Dec 3 '09 at 22:56 by Joe Koberg

You can use urllib2 (urllib.request in Python 3) and parse the HTML yourself.

Or try Beautiful Soup to do some of the parsing for you.
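As a rough idea of what "parsing the HTML yourself" can look like, here is a sketch using the standard library's `html.parser` in Python 3. It is not Beautiful Soup; the class name `LinkCollector` and the choice to collect link targets are assumptions made only for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href values of all anchor tags in `html_text`."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

The input has to be text, so a page fetched with `urlopen` needs to be decoded from bytes before it is passed to `extract_links`.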

answered Dec 3 '09 at 22:29 by JasDev

Tried urllib2 and urllib, but neither worked. (Edited first post) - Andrew, Dec 3 '09 at 22:32

Andrew, others can help you better if you describe in detail what you tried and what error message(s) / unexpected behaviour resulted. - micahwittman, Dec 3 '09 at 22:35

I edited it into my initial post because I didn't want a huge comment. :P. - Andrew, Dec 3 '09 at 22:37

A solution that works with Python 2.X and Python 3.X:

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2's urllib2
    from urllib2 import urlopen

url = 'http://hiscore.runescape.com/index_lite.ws?player=zezima'
response = urlopen(url)
# read() returns bytes on Python 3; decode it rather than calling str(),
# which would yield a "b'...'" literal there. UTF-8 is an assumption.
data = response.read().decode('utf-8')

answered Jul 18 '16 at 3:38 by Martin Thoma
