Commit graph

13 commits

Author SHA1 Message Date
Claire
9c1d3086af Fix inefficiencies in auto-linking code (#16506)
The auto-linking code basically rewrote the whole string escaping non-ascii
characters in an inefficient way, and building a full character offset map
between the unescaped and escaped texts before sending the contents to
TwitterText's extractor.

Instead of doing that, this commit changes the TwitterText regexps to include
valid IRI characters in addition to valid URI characters.
2021-07-15 15:56:58 +02:00
Claire
a33f8f787a Update twitter-text from 1.14 to 3.1.0 and fix toot character counting (#15382)
* Update twitter-text from 1.14 to 3.1.0

* Disable emoji parsing

* Properly depend on twitter-text for url detection

* Fix some URLs being wrongly detected client-side

* Add test for server-side validation of non-autolinkable URLs

* Fix server-side status length counting
2021-03-02 12:02:56 +01:00
Shubhendra Singh Chauhan
daea469385 Fixed code quality issues (#15541)
* Added .deepsource.toml

* Removed bad use of `alias`

* Fixed operand order in the binary expression

* Prefixed unused method arguments with an underscore

* Replaced the old OpenSSL algorithmic constants with the newer strings initializers.

* Removed unnecessary UTF-8 encoding comment
2021-01-31 21:26:09 +01:00
Josh Leeb-du Toit
c94a083b9a Add support for Gemini urls (#15013)
This PR updates the `valid_url` regex and sanitizer allowlist to provide
support for Gemini urls.

Closes #14991
2020-10-19 17:02:13 +02:00
ThibG
25e7aa913c Add support for magnet: URIs (#12905) 2020-01-23 21:27:26 +01:00
ThibG
b7927b397d Add support for linking XMPP URIs in toots (#12709)
* Fix wrong grouping in Twitter valid_url regex

* Add support for xmpp URIs

Fixes #9776

The difficult part is autolinking, because Twitter-text's extractor does
some pretty ad-hoc stuff to find things that “look like” URLs, and XMPP
URIs do not really match the assumptions of that lib, so it doesn't sound
wise to try to shoehorn it into the existing regex.

This is why I used a specific regex (very close, although slightly more
permissive than the RFC), and a specific scan function (a simplified version
of the generalized one from Twitter).

* Remove leading “xmpp:” from auto-linked text
2020-01-11 02:15:25 +01:00
Eugen Rochko
5adc3f5676 Fix URL linkifier grabbing full-width spaces and quotations (#9997)
Fix #9993
Fix #5654
2019-02-09 20:13:11 +01:00
aus-social
a53bcb6213 Lint pass (#8876) 2018-10-04 12:36:53 +02:00
luzpaz
1bce70d3c7 Misc. typos (#8694)
Found via `codespell -q 3 --skip="./app/javascript/mastodon/locales,./config/locales"`
2018-09-14 00:53:09 +02:00
Eugen Rochko
f8bc071e03 Add dat, dweb, ipfs, ipns, ssb, gopher protocols to URL extractor (#7810)
* Add dat:// and gopher:// to URL extractor

Fix #6072

* Fix comment indent

* Add dweb, ipfs, ipns, ssb
2018-06-15 20:21:47 +02:00
Daniel King
845ea13622 Fix URLs incorrectly having trailing hyphen removed (#6465)
In cases where a URL has a trailing hyphen the FetchLinkCardService incorrectly removes the hyphen when it is parsed

The hyphen is not a reserved character in the URI spec https://tools.ietf.org/html/rfc3986#section-2.2
2018-02-11 23:49:18 +01:00
unarist
938a99c89d Re-allow underscore on valid_url_path_ending_chars (#4999)
Limiting allowed characters in the last character of the URL is came from twitter-text, but underscore is allowed on there, and Mastodon before #4941.
2017-09-18 21:25:40 +02:00
ふぁぼ原
c71727ca55 Enable to recognize most kinds of characters as URL paths (#4941) 2017-09-14 18:03:20 +02:00