Find All URLs In File
My problem is that my code only finds and prints the last URL in the file, not all of the URLs as I want. The relevant part of my code is:

def convert(lst):
    return ' '.join(lst)

with open('test.txt', '
Solution 1:
Your lines variable contains each line in the file. You want to do something like the following:
import re

def convert(lst):
    return ' '.join(lst)

with open("test.txt", 'r') as f:
    lines = f.readlines()   # every line in the file
    test = convert(lines)   # join the lines into one string
    urls = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', test)
    print(urls)
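For example, with a hypothetical test.txt containing the two lines

https://example.com/page and some text
more text with http://example.org/other

the script above would print both matches in a single list:

['https://example.com/page', 'http://example.org/other']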
Solution 2:
There is no need to first read all the lines and then join them. Instead, you can read all of the data in the file in one step with f.read().
Try this:
import re

with open("test.txt", 'r') as f:
    # scan the whole file contents in one pass
    urls = re.findall('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', f.read())
Now executing print(urls) will produce the desired output.
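If the file is very large, a minor variation (a sketch, not part of the original answer, and assuming each URL fits on a single line) is to compile the pattern once and scan line by line, collecting matches as you go, so the whole file never has to sit in memory at once:

import re

URL_PATTERN = re.compile(
    'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)

urls = []
with open("test.txt", 'r') as f:
    for line in f:                          # iterate lazily, one line at a time
        urls.extend(URL_PATTERN.findall(line))

print(urls)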