(no subject)

Sunday, 2 March 2025 11:44
[personal profile] pangolin20
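Below the cut is a Python program that builds an index of the front page or a day view of a Dreamwidth blog. For each entry it prints an HTML link with the entry title, followed by the text of any custom cut tags in square brackets. I've used it successfully to keep an index of one of the communities I'm part of.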

import requests


def getLinks(slug):
    # Fetch the page; replace the bracketed placeholder with the journal or community name.
    url1 = f'https://[journal/community name goes here].dreamwidth.org{slug}'
    content = requests.get(url1).content
    # str() of a bytes object shows each newline as a literal backslash plus "n", so split on that.
    content_split = str(content).split("\\n")

    url_title = ""
    url_cut = ""
    for line in content_split:
        # Entry title: collect the link target and the title text.
        if 'a title' in line:
            title_line = line.split("\">")
            # Skip user links and sticky entries.
            if 'lj:user' not in title_line[0] and 'Sticky' not in title_line[0]:
                f_2 = title_line[1].split("=\"")
                real_href = f_2[2]
                title = f_2[1].split("\" h")
                url_title += "<a href=\"" + real_href + "\">" + title[0]
        # Cut tags: append the text of each custom cut in square brackets.
        if 'cutid' in line:
            g = line.split("cutid")
            for z in range(1, len(g)):
                h = g[z].split("</a>")
                index = h[0].split(">")
                if 'Read more...' != index[1]:
                    url_cut += " [" + index[1] + "]"
                if 'Read more...' == index[1]:
                    url_cut = ""

        # The footer closes an entry: print the finished link, then reset for the next one.
        if 'footer' in line:
            if len(url_title) > 0:
                print(url_title + url_cut + "</a>")
            url_cut = ""
            url_title = ""


getLinks('/')
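
The call above indexes the front page. For a day view, the slug has to point at a specific date. Here is a minimal sketch of how such a slug could be built, assuming day views live at paths of the form /YYYY/MM/DD/. The helper name day_slug is made up for this example and is not part of the program above.

```python
import datetime


def day_slug(day):
    """Hypothetical helper: build a day-view slug such as '/2025/03/02/'."""
    # Assumption: day views use zero-padded /YYYY/MM/DD/ paths.
    return day.strftime('/%Y/%m/%d/')


slug = day_slug(datetime.date(2025, 3, 2))
print(slug)
# The slug can then be handed to the indexer: getLinks(slug)
```

If the assumption about the path layout holds, passing the result to getLinks should index that day's entries instead of the front page.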
