MarseyWorld/files/routes/search.py

import re
import time
from calendar import timegm
from sqlalchemy import *
from files.helpers.regex import *
from files.helpers.sorting_and_time import *
from files.routes.wrappers import *
from files.__main__ import app

search_operator_hole = HOLE_NAME

# Operators accepted in a search query as `key:value` pairs
valid_params = [
    'author',
    'domain',
    'over18',
    'post',
    'before',
    'after',
    'title',
    'cc',
    search_operator_hole,
]

def searchparse(text):
    """Split a raw search string into `key:value` operators and free-text tokens."""
    text = text.lower()

    criteria = {x[0]: x[1] for x in query_regex.findall(text)}
    # Remove recognized `key:value` operators so only the free text remains
    for x in criteria:
        if x in valid_params:
            text = text.replace(f"{x}:{criteria[x]}", "")

    text = text.strip()

    if text:
        criteria['full_text'] = text
        criteria['q'] = []
        for m in search_token_regex.finditer(text):
            token = m[1] if m[1] else m[2]
            # Escape SQL pattern matching special characters
            token = token.replace('\\', '').replace('_', '\\_').replace('%', '\\%')
            criteria['q'].append(token)

    return criteria

@app.get("/search/posts")
@auth_required
def searchposts(v):
    query = request.values.get("q", '').strip()

    try:
        page = max(1, int(request.values.get("page", 1)))
    except ValueError:
        abort(400, "Invalid page input!")

    sort = request.values.get("sort", "new").lower()
    t = request.values.get('t', 'all').lower()

    criteria = searchparse(query)

    posts = g.db.query(Submission.id) \
        .join(Submission.author) \
        .filter(Submission.author_id.notin_(v.userblocks))

    if not v.paid_dues:
        posts = posts.filter(Submission.club == False)

    # Users without moderation permissions only see live, public posts
    if v.admin_level < PERMS['POST_COMMENT_MODERATION']:
        posts = posts.filter(
            Submission.deleted_utc == 0,
            Submission.is_banned == False,
            Submission.private == False)

    if 'author' in criteria:
        posts = posts.filter(Submission.ghost == False)
        author = get_user(criteria['author'], v=v, include_shadowbanned=False)
        if not author.is_visible_to(v):
            if v.client:
                abort(403, f"@{author.username}'s profile is private; You can't use the 'author' syntax on them")

            return render_template("search.html",
                v=v,
                query=query,
                total=0,
                page=page,
                listing=[],
                sort=sort,
                t=t,
                next_exists=False,
                domain=None,
                domain_obj=None,
                error=f"@{author.username}'s profile is private; You can't use the 'author' syntax on them."
            ), 403
        else:
            posts = posts.filter(Submission.author_id == author.id)

    if 'q' in criteria:
        # Every token must match: the title only when `title:` is given, else the title or the body
        if 'title' in criteria:
            words = [Submission.title.ilike('%' + x + '%')
                for x in criteria['q']]
        else:
            words = [or_(Submission.title.ilike('%' + x + '%'), Submission.body.ilike('%' + x + '%'))
                for x in criteria['q']]
        posts = posts.filter(*words)

    if 'over18' in criteria:
        posts = posts.filter(Submission.over_18 == True)

    if 'domain' in criteria:
        domain = criteria['domain']
        # Strip backslashes and '%', and escape '_' so it is matched literally by ILIKE
        domain = domain.replace('\\', '').replace('_', '\\_').replace('%', '').strip()

        posts = posts.filter(
            or_(
                Submission.url.ilike("https://" + domain + '/%'),
                Submission.url.ilike("https://" + domain),
                Submission.url.ilike("https://www." + domain + '/%'),
                Submission.url.ilike("https://www." + domain),
                Submission.url.ilike("https://old." + domain + '/%'),
                Submission.url.ilike("https://old." + domain),
            )
        )

    if search_operator_hole in criteria:
        posts = posts.filter(Submission.sub == criteria[search_operator_hole])

    # `after` and `before` accept either a Unix timestamp or a YYYY-MM-DD date (read as UTC)
    if 'after' in criteria:
        after = criteria['after']
        try:
            after = int(after)
        except ValueError:
            try:
                after = timegm(time.strptime(after, "%Y-%m-%d"))
            except ValueError:
                abort(400)
        posts = posts.filter(Submission.created_utc > after)

    if 'before' in criteria:
        before = criteria['before']
        try:
            before = int(before)
        except ValueError:
            try:
                before = timegm(time.strptime(before, "%Y-%m-%d"))
            except ValueError:
                abort(400)
        posts = posts.filter(Submission.created_utc < before)

    if 'cc' in criteria:
        cc = criteria['cc'].lower().strip()
        if cc == 'true': cc = True
        elif cc == 'false': cc = False
        else: abort(400)
        posts = posts.filter(Submission.club == cc)

    posts = apply_time_filter(t, posts, Submission)

    posts = sort_objects(sort, posts, Submission,
        include_shadowbanned=(v and v.can_see_shadowbanned))

    total = posts.count()

    # Fetch one extra row to learn whether a next page exists
    posts = posts.offset(PAGE_SIZE * (page - 1)).limit(PAGE_SIZE + 1).all()

    ids = [x[0] for x in posts]

    next_exists = (len(ids) > PAGE_SIZE)
    ids = ids[:PAGE_SIZE]

    posts = get_posts(ids, v=v, eager=True)

    if v.client:
        return {"total": total, "data": [x.json(g.db) for x in posts]}

    return render_template("search.html",
        v=v,
        query=query,
        total=total,
        page=page,
        listing=posts,
        sort=sort,
        t=t,
        next_exists=next_exists
    )

@app.get("/search/comments")
@auth_required
def searchcomments(v):
    query = request.values.get("q", '').strip()

    try:
        page = max(1, int(request.values.get("page", 1)))
    except ValueError:
        abort(400, "Invalid page input!")

    sort = request.values.get("sort", "new").lower()
    t = request.values.get('t', 'all').lower()

    criteria = searchparse(query)

    comments = g.db.query(Comment.id).join(Comment.post) \
        .filter(Comment.parent_submission != None, Comment.author_id.notin_(v.userblocks))

    if 'post' in criteria:
        try:
            post = int(criteria['post'])
        except ValueError:
            abort(404)
        comments = comments.filter(Comment.parent_submission == post)

    if 'author' in criteria:
        comments = comments.filter(Comment.ghost == False)
        author = get_user(criteria['author'], v=v, include_shadowbanned=False)
        if not author.is_visible_to(v):
            if v.client:
                abort(403, f"@{author.username}'s profile is private; You can't use the 'author' syntax on them")

            return render_template("search_comments.html", v=v, query=query, total=0, page=page, comments=[], sort=sort, t=t, next_exists=False, error=f"@{author.username}'s profile is private; You can't use the 'author' syntax on them."), 403
        else:
            comments = comments.filter(Comment.author_id == author.id)

    if 'q' in criteria:
        # Build a PostgreSQL tsquery: strip operator characters, drop empty tokens,
        # join the words inside a token with <-> and AND the tokens together with &
        tokens = map(lambda x: re.sub(r'[\0():|&*!<>]', '', x), criteria['q'])
        tokens = filter(lambda x: len(x) > 0, tokens)
        tokens = map(lambda x: re.sub(r'\s+', ' <-> ', x), tokens)
        comments = comments.filter(Comment.body_ts.match(
            ' & '.join(tokens),
            postgresql_regconfig='english'))

    if 'over18' in criteria:
        comments = comments.filter(Comment.over_18 == True)

    if search_operator_hole in criteria:
        comments = comments.filter(Submission.sub == criteria[search_operator_hole])

    comments = apply_time_filter(t, comments, Comment)

    # Users without moderation permissions only see live comments on public posts
    if v.admin_level < PERMS['POST_COMMENT_MODERATION']:
        private = [x[0] for x in g.db.query(Submission.id).filter(Submission.private == True).all()]

        comments = comments.filter(Comment.is_banned == False, Comment.deleted_utc == 0, Comment.parent_submission.notin_(private))

    if not v.paid_dues:
        club = [x[0] for x in g.db.query(Submission.id).filter(Submission.club == True).all()]
        comments = comments.filter(Comment.parent_submission.notin_(club))

    # `after` and `before` accept either a Unix timestamp or a YYYY-MM-DD date (read as UTC)
    if 'after' in criteria:
        after = criteria['after']
        try:
            after = int(after)
        except ValueError:
            try:
                after = timegm(time.strptime(after, "%Y-%m-%d"))
            except ValueError:
                abort(400)
        comments = comments.filter(Comment.created_utc > after)

    if 'before' in criteria:
        before = criteria['before']
        try:
            before = int(before)
        except ValueError:
            try:
                before = timegm(time.strptime(before, "%Y-%m-%d"))
            except ValueError:
                abort(400)
        comments = comments.filter(Comment.created_utc < before)

    comments = sort_objects(sort, comments, Comment,
        include_shadowbanned=(v and v.can_see_shadowbanned))

    total = comments.count()

    # Fetch one extra row to learn whether a next page exists
    comments = comments.offset(PAGE_SIZE * (page - 1)).limit(PAGE_SIZE + 1).all()

    ids = [x[0] for x in comments]

    next_exists = (len(ids) > PAGE_SIZE)
    ids = ids[:PAGE_SIZE]

    comments = get_comments(ids, v=v)

    if v.client:
        return {"total": total, "data": [x.json for x in comments]}

    return render_template("search_comments.html", v=v, query=query, total=total, page=page, comments=comments, sort=sort, t=t, next_exists=next_exists, standalone=True)

@app.get("/search/users")
@auth_required
def searchusers(v):
    query = request.values.get("q", '').strip()

    try:
        page = max(1, int(request.values.get("page", 1)))
    except ValueError:
        abort(400, "Invalid page input!")

    sort = request.values.get("sort", "new").lower()
    t = request.values.get('t', 'all').lower()

    term = query.lstrip('@')
    # Strip backslashes and '%', and escape '_' so it is matched literally by ILIKE
    term = term.replace('\\', '').replace('_', '\\_').replace('%', '')

    users = g.db.query(User).filter(
        or_(
            User.username.ilike(f'%{term}%'),
            User.original_username.ilike(f'%{term}%')
        )
    )

    if v.admin_level < PERMS['USER_SHADOWBAN']:
        users = users.filter(User.shadowbanned == None)

    # Exact username matches first, then by subscriber count
    users = users.order_by(User.username.ilike(term).desc(), User.stored_subscriber_count.desc())

    total = users.count()

    # Fetch one extra row to learn whether a next page exists
    users = users.offset(PAGE_SIZE * (page - 1)).limit(PAGE_SIZE + 1).all()
    next_exists = (len(users) > PAGE_SIZE)
    users = users[:PAGE_SIZE]

    if v.client:
        return {"data": [x.json for x in users]}

    return render_template("search_users.html", v=v, query=query, total=total, page=page, users=users, sort=sort, t=t, next_exists=next_exists)