Standardization of text markup

Angius

The Ponut Eater
Currently, markup used on the site is a weird mixture of Markdown, BBCode, and I don’t even know what else. That makes stuff like getting an image description from an API rather inconvenient:
 
[url=http://google.com]Google[/url] can easily be parsed with any BBCode library  
[Google](http://google.com) can be parsed even more easily, since there are even more Markdown libraries than BBCode ones  
<a href="http://google.com">Google</a> can, similarly, be parsed with any HTML parser, or even displayed directly as intended
 
"Google":http://google.com however?
 
Just some food for thought, maybe somewhere down the line.
 
Edit: the last example given doesn’t even care about the order of the markup, shows a link instead of the code…
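
To make the comparison above concrete, here is a minimal Python sketch that pulls the link text and URL out of each of the four syntaxes with one regular expression per syntax. The pattern table and the `extract_links` helper are hypothetical names invented for this illustration, not part of the site or its API, and the patterns only cover the simple cases shown in the post (no nested brackets, no extra HTML attributes).

```python
import re

# Hypothetical patterns, one per link syntax from the post above.
LINK_PATTERNS = {
    "bbcode": re.compile(r'\[url=(?P<url>[^\]\s]+)\](?P<text>.*?)\[/url\]'),
    "markdown": re.compile(r'\[(?P<text>[^\]]+)\]\((?P<url>[^)\s]+)\)'),
    "html": re.compile(r'<a\s+href="(?P<url>[^"]+)">(?P<text>.*?)</a>'),
    "quoted": re.compile(r'"(?P<text>[^"]+)":(?P<url>\S+)'),
}


def extract_links(text):
    """Return (syntax, link_text, url) tuples for every link found in text."""
    found = []
    for syntax, pattern in LINK_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((syntax, match.group("text"), match.group("url")))
    return found


if __name__ == "__main__":
    for sample in (
        '[url=http://google.com]Google[/url]',
        '[Google](http://google.com)',
        '<a href="http://google.com">Google</a>',
        '"Google":http://google.com',
    ):
        print(extract_links(sample))
```

With a single known syntax only one of these patterns would be needed. With a mixture, every description has to be run through all of them, which is the inconvenience described above.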
byte[]
Solar Supporter - Fought against the New Lunar Republic rebellion on the side of the Solar Deity (April Fools 2023).
Non-Fungible Trixie -
Verified Pegasus - Show us your gorgeous wings!
Preenhub - We all know what you were up to this evening~
An Artist Who Rocks - 100+ images under their artist tag
Artist -

Philomena Contributor
@Angius  
Yeah we’re working on moving it to BBCode. I have a reference parser and am waiting on another coder to see what he’s come up with so far.
Angius

The Ponut Eater
Out of curiosity, why BBCode and not Markdown? MD is more concise, requires less typing, and, I believe, there are more parsers for it. BBCode sounds kinda obsolete nowadays.
byte[]

Philomena Contributor
@Angius  
Ease of implementation, since BBCode parsers are almost trivial to write, and ease of adaptation, since the site is extremely restrictive about what’s allowed in markup.
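
As a rough illustration of both points, here is a minimal Python sketch of a whitelist-based renderer. It is not the reference parser mentioned in this thread. The `ALLOWED_TAGS` table, the tag names in it, and the `render_bbcode` function are assumptions made up for the example, and it deliberately does not check that tags are balanced.

```python
import html
import re

# Hypothetical whitelist: BBCode tag name -> HTML element name.
ALLOWED_TAGS = {"b": "strong", "i": "em", "u": "ins", "s": "del"}

# Matches simple tags such as [b] or [/b] (no attributes in this sketch).
TAG_RE = re.compile(r"\[(/?)([a-z]+)\]")


def render_bbcode(source):
    """Render whitelisted tags as HTML and escape everything else as text."""
    out = []
    pos = 0
    for match in TAG_RE.finditer(source):
        # Text between tags is always escaped, so raw HTML never passes through.
        out.append(html.escape(source[pos:match.start()]))
        closing, name = match.group(1), match.group(2)
        if name in ALLOWED_TAGS:
            out.append("<%s%s>" % (closing, ALLOWED_TAGS[name]))
        else:
            # Tags outside the whitelist are shown literally, not interpreted.
            out.append(html.escape(match.group(0)))
        pos = match.end()
    out.append(html.escape(source[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(render_bbcode("[b]hi[/b] <script>"))
    print(render_bbcode("[video]x[/video]"))
```

Restricting what is allowed comes down to editing one dictionary, which is the kind of adaptation that is harder to get from a general-purpose Markdown library.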
Angius

The Ponut Eater
As far as parsers go, there are plenty of MD parser gems, Redcarpet for example, since Derpibooru is written in RoR, if I recall correctly. When it comes to restrictiveness, I would understand if MD allowed for inserting videos, outside JS code, stuff like that. But the only things base MD has that aren’t already there are ordered and unordered lists. And code blocks.
 
Then again, I’d be grateful for any kind of standard, be it BBcode or Markdown.
byte[]

Philomena Contributor
@Angius  
Redcarpet isn’t an option I want to pursue because it uses a native extension to do parsing, and I’m trying to get JRuby support for the site going. Kramdown is an extremely complicated gem. While I do like the Markdown syntax, I find that it has far too many bad production rule pathologies and can make for a very ugly parser, so it’s not something I’d want to write my own of either.
 
This makes Markdown not something I’m particularly inclined to pick. Raw HTML, like the OP describes, isn’t the worst thing to allow, but it’s not fantastic either. BBCode is kind of a subtle balance, since it’s easy enough to implement yourself in a few hundred lines and can’t be used to exploit typical HTML/XML parser bugs (thanks, libxml2).
 
The main vote in favor of BBCode for me is that it is reasonable to expect that documents are well-formed (that is, have a closing tag for every opening tag, and correctly nest tags). This isn’t something you can do with *asterisks* because you could e.g. write an asterisk-style* footnote with them and break the expectations of the parser. Not so with BBCode; when you write an opening tag, you get precisely that tag, and you must close it. This makes parsing considerably easier and less likely to be vulnerable.
 
* like this
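
The well-formedness argument can be sketched with a simple stack. The Python below is only an illustration under assumed names (`check_well_formed`, the default `known_tags` tuple), not the parser discussed in this thread. It shows that a lone asterisk in the text is simply ignored, because only explicit tags count as markup.

```python
import re

# Matches simple tags such as [b] or [/b] (no attributes in this sketch).
TAG_RE = re.compile(r"\[(/?)([a-z]+)\]")


def check_well_formed(source, known_tags=("b", "i", "u", "s")):
    """Return None if known tags are balanced and nested, else an error string."""
    stack = []
    for match in TAG_RE.finditer(source):
        closing, name = match.group(1), match.group(2)
        if name not in known_tags:
            continue  # unknown tags are treated as plain text
        if not closing:
            stack.append(name)
        elif not stack:
            return "unexpected [/%s]" % name
        elif stack[-1] != name:
            return "expected [/%s] but found [/%s]" % (stack[-1], name)
        else:
            stack.pop()
    if stack:
        return "unclosed [%s]" % stack[-1]
    return None


if __name__ == "__main__":
    print(check_well_formed("[b]bold [i]both[/i][/b]"))
    print(check_well_formed("a footnote* with a stray asterisk"))
    print(check_well_formed("[b]never closed"))
```

Every opening tag is pushed, every closing tag must match the top of the stack, and anything left over at the end is an error. There is no ambiguity to resolve, unlike with a delimiter that can also be ordinary punctuation.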
Angius

The Ponut Eater
Yep, I see your point.
 
Okay, I’ll be waiting for the switch to BBCode then, and in the meantime I’ll try to regex my way through the current mess…
barbeque
Roseluck - Had their OC in the 2023 Derpibooru Collab.
Elements of Harmony - Had an OC in the 2022 Community Collab
Non-Fungible Trixie -
Twinkling Balloon - Took part in the 2021 community collab.
Friendship, Art, and Magic (2018) - Celebrated Derpibooru's six year anniversary with friends.
Magical Inkwell - Wrote MLP fanfiction consisting of at least around 1.5k words, and has a verified link to the platform of their choice
Magnificent Metadata Maniac - #1 Assistant
Thread Starter - Tag alias request thread
Artist -
Bronze Bit -

@byte[]  
Since we’re moving to BBCode anyway: I don’t know what reference parsers say about this, but are we going to enforce strictly correct nesting of tags, or just do whatever when tags are nested incorrectly? In other words,  
\_some words \*yolo\_ more words\*  
Current behaviour: some words yolo more words  
Annoying (?) incorrectly-nested-but-eventually-closed tags behaviour: some words yolo more words  
(also, have fun converting that to BBCode. Hard mode is where  
some words  
*yolo
 
more words*  
are in separate lines/paragraphs)

 
It’s less relevant for Derpi than it is for, say, Fimfiction (because epub standards), but I happen to like correct nesting.
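
One possible answer to the nesting question, sketched in Python: instead of rejecting mis-nested input, a parser could rewrite it into correctly nested form by closing the inner tags early and reopening them afterwards. This is a hypothetical approach for illustration only. The `repair_nesting` function and the `KNOWN_TAGS` set are assumptions, and nothing in this thread says the site will behave this way.

```python
import re

# Splits the input into tags and the text between them.
TOKEN_RE = re.compile(r"(\[/?[a-z]+\])")
TAG_RE = re.compile(r"\[(/?)([a-z]+)\]")
KNOWN_TAGS = {"b", "i", "u", "s"}  # hypothetical tag set


def repair_nesting(source):
    """Rewrite mis-nested known tags so every tag closes in stack order."""
    out = []
    stack = []
    for token in TOKEN_RE.split(source):
        if not token:
            continue
        match = TAG_RE.fullmatch(token)
        if match is None or match.group(2) not in KNOWN_TAGS:
            out.append(token)  # plain text or unknown tag
            continue
        closing, name = match.group(1), match.group(2)
        if not closing:
            stack.append(name)
            out.append(token)
        elif name in stack:
            # Close any tags opened after this one, then reopen them.
            reopened = []
            while stack[-1] != name:
                inner = stack.pop()
                out.append("[/%s]" % inner)
                reopened.append(inner)
            stack.pop()
            out.append(token)
            for inner in reversed(reopened):
                stack.append(inner)
                out.append("[%s]" % inner)
        # A closing tag with no matching opener is dropped.
    while stack:
        out.append("[/%s]" % stack.pop())  # close whatever is still open
    return "".join(out)


if __name__ == "__main__":
    print(repair_nesting("[i]some words [b]yolo[/i] more words[/b]"))
```

For the example above, the bold span is split in two so that each piece nests cleanly. That keeps the visual result of the lenient behaviour while still producing output that a strict consumer (such as an epub generator) can accept.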