Use a tree‐based approach for advanced text formatting (#1907)
* Use a tree‐based approach for adv. text formatting

Sanitizing HTML/Markdown means parsing the content into an HTML tree under‐the‐hood anyway, and it is more accurate to do mention/hashtag replacement on the text nodes in that tree than it is to try to hack it in with regexes et cetera.

This undoes the overrides of `#entities` and `#rewrite` on `AdvancedTextFormatter` but also stops using them, instead keeping track of the parsed Nokogiri tree itself and using that in the `#to_s` method.

Internally, this tree uses `<mastodon-entity>` nodes to keep track of hashtags, links, and mentions. Sanitization is moved to the beginning, so it should be known that these do not appear in the input.

* Also disallow entities inside of `<code>`

I think this is generally expected behaviour, and people are annoyed when their code gets turned into links/hashtags/mentions.

* Minor cleanup to AdvancedTextFormatter

* Change AdvancedTextFormatter to rewrite entities in one pass and sanitize at the end

Also, minor refactoring to better match how other formatters are organized.

* Add some tests

Co-authored-by: Claire <claire.github-309c@sitedethib.com>
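
As a rough illustration of the idea in the message above (rewrite entities on text nodes of a parsed tree, and leave anything under `<code>` alone), here is a minimal sketch. It is not the code from this commit: it uses a toy tree built from plain Ruby structs instead of Nokogiri so that it runs on its own, it only handles links, and `TextNode`, `ElementNode`, `URL_RE`, `escape` and `render` are hypothetical names chosen for the example.

```ruby
# Toy stand-ins for a parsed HTML tree (assumption: the real formatter
# walks a Nokogiri tree; these structs only mimic its shape).
TextNode    = Struct.new(:content)
ElementNode = Struct.new(:name, :children)

# Deliberately simple URL pattern, for illustration only.
URL_RE = %r{https?://[^\s<>"]+}

# Escape text so it can be emitted as HTML.
def escape(text)
  text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
end

# Serialize the tree. URLs found in text nodes become links, except when
# the text node sits somewhere below a <code> element.
def render(node, in_code = false)
  case node
  when TextNode
    return escape(node.content) if in_code

    # Splitting on a capturing group keeps the matches: even indexes are
    # plain text, odd indexes are URLs.
    node.content.split(/(#{URL_RE})/).each_with_index.map do |part, index|
      index.odd? ? %(<a href="#{escape(part)}">#{escape(part)}</a>) : escape(part)
    end.join
  when ElementNode
    inner = node.children.map { |child| render(child, in_code || node.name == 'code') }.join
    "<#{node.name}>#{inner}</#{node.name}>"
  end
end

tree = ElementNode.new('p', [
  TextNode.new('see https://example.com/a and '),
  ElementNode.new('code', [TextNode.new('https://example.com/b')])
])

puts render(tree)
# <p>see <a href="https://example.com/a">https://example.com/a</a> and <code>https://example.com/b</code></p>
```

Because the decision is made per text node, with the `<code>` ancestry passed down during the walk, there is no need for a regex to guess whether a match falls inside code.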
@@ -35,7 +35,7 @@ RSpec.describe AdvancedTextFormatter do
     end
 
     context 'given a block code' do
-      let(:text) { "test\n\n```\nint main(void) {\n return 0;\n}\n```\n" }
+      let(:text) { "test\n\n```\nint main(void) {\n return 0; // https://joinmastodon.org/foo\n}\n```\n" }
 
       it 'formats code using <pre> and <code>' do
         is_expected.to include '<pre><code>int main'
@@ -44,13 +44,17 @@ RSpec.describe AdvancedTextFormatter do
       it 'does not strip leading spaces' do
         is_expected.to include '> return 0'
       end
+
+      it 'does not format links' do
+        is_expected.to include 'return 0; // https://joinmastodon.org/foo'
+      end
     end
 
-    context 'given some quote' do
-      let(:text) { "> foo\n\nbar" }
+    context 'given a link in inline code using backticks' do
+      let(:text) { 'test `https://foo.bar/bar` bar' }
 
-      it 'formats code using <code>' do
-        is_expected.to include '<blockquote><p>foo</p></blockquote>'
+      it 'does not rewrite the link' do
+        is_expected.to include 'test <code>https://foo.bar/bar</code> bar'
       end
     end
 