Posts
Search Results
Site and Policy » Database dump survey » Post 138
xbi
@byte[]
I don’t know how to properly handle zero values in a geometric mean. Maybe just throw them out or add an artificial bias (1? But why not 2, 3, or 0.1?), but that seems like a quick and dirty fix.
I rely more on quantiles or median values for drawing conclusions, and use the average for visualisation just to make the plot line look smoother (I work with data for plots >>1664423).
Sometimes I use the geometric mean implicitly, as the average of logarithms of the biased value, but I don’t need it here, because the usual average gives an acceptable result for visualisation.
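The two workarounds mentioned above can be sketched roughly as follows. This is only an illustration, not the code used for the plots: the function names and the `bias` parameter are hypothetical, and subtracting the bias back out at the end is one possible convention among several.

```python
import math


def geometric_mean(values, bias=0.0):
    """Geometric mean computed as exp(mean(log(v + bias))) - bias.

    `bias` is a hypothetical offset that gives zeros a defined logarithm.
    The result depends on which bias is chosen.
    """
    shifted = [v + bias for v in values]
    if not shifted:
        raise ValueError("need at least one value")
    if any(s <= 0 for s in shifted):
        raise ValueError("all values must be positive after adding the bias")
    mean_log = sum(math.log(s) for s in shifted) / len(shifted)
    return math.exp(mean_log) - bias


def geometric_mean_drop_zeros(values):
    """Alternative workaround: discard zeros before averaging."""
    return geometric_mean([v for v in values if v > 0])
```

For the same data `[0, 8]`, a bias of 1 and a bias of 0.1 give noticeably different results, which is the arbitrariness the post is worried about.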
Site and Policy » Database dump survey » Post 136
xbi
@byte[]
image_intensities table,
((se+sw+ne+nw)/4 / 85.425 * 100)
I don’t use avg_intensity from the database, because it is (or is close to) the average of the other intensities, so I didn’t export it, to avoid redundancy.
I distribute images into 100 buckets from 0% to 99% based on this luminosity and print the median or average of the images in each bucket. The median does not work well for plot visualization, because it is a small integer of about 20-30 and produces overly sharp “steps” on the plot. But it behaves the same way as the average, with extrema in the same places.
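A minimal sketch of the bucketing described above, under some assumptions: the row layout `(nw, ne, sw, se, faves)`, the function names, and the choice of fave count as the per-bucket value (based on the diagram described in Post 134) are all illustrative rather than taken from the actual script.

```python
from statistics import mean, median

WHITE = 85.425  # intensity the thread reports for a pure white area


def luminosity_percent(nw, ne, sw, se):
    """Whole-image luminosity in percent, using the formula from the post."""
    return (se + sw + ne + nw) / 4 / WHITE * 100


def bucket_stats(rows, buckets=100):
    """Group rows into luminosity buckets; return {bucket: (median, mean)}.

    rows: iterable of (nw, ne, sw, se, faves) tuples (hypothetical layout).
    A luminosity of exactly 100% is clamped into the last bucket.
    """
    grouped = {}
    for nw, ne, sw, se, faves in rows:
        lum = luminosity_percent(nw, ne, sw, se)
        index = min(int(lum * buckets / 100), buckets - 1)
        grouped.setdefault(index, []).append(faves)
    return {i: (median(f), mean(f)) for i, f in sorted(grouped.items())}
```

Printing both values per bucket makes it easy to check the observation that the median and the average have their extrema in the same places.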
Site and Policy » Database dump survey » Post 134
xbi
I have made a diagram which maps whole-image luminosity to the number of faves of images with that luminosity.
Sorry, no quantiles/confidence intervals (anyway, they are not very reliable for real-world, not-so-random variables) :/ https://imgur.com/a/LSMzirE
There are smooth maxima at 30% and 80%; I can’t determine the reason. There is also a sharp peak at 0%.
The peak at luminosity = 0% is mostly artist:brist-sta sketches with pure black RGB channels shaped by alpha - id:575309 OR id:452762 OR id:1214368 OR id:450548 OR id:797007 OR id:518545 OR id:454628 OR id:93637 OR id:777566 OR id:751581 OR id:682715 OR id:656416 OR id:818833 OR id:1044815 OR id:519373 OR id:1315209 OR id:269536 OR id:247799 OR id:248686 OR id:1529097
This artist alone is the reason for the peak of faves/upvotes at luminosity = 0.
Site and Policy » Database dump survey » Post 133
xbi
Something less trivial. While the average brightness of all derpibooru art peaks at 99%-100%, the average brightness of top-rated art peaks at 80%. Also, the top half is slightly brighter than the bottom. Actually this is not a new insight for artists; it is a well-known rule that the top of an image should have more space and be lighter than the bottom. Sorry, I can’t make very clear illustrations today, but
I have sacrificed non-safe images, because their fave counts are less about brightness distribution and more about image plot, they have other “quality” thresholds for fave count, and they are the smaller part of images overall. Brightness is taken from the image_intensities database table, the value 85.425 is renormalized to 100%, top brightness is taken as the ne+nw average, and bottom brightness is taken as the se+sw average.
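The top/bottom split described above amounts to something like the following sketch. The function name is hypothetical; the formulas follow the description in the post.

```python
WHITE = 85.425  # raw intensity that is renormalized to 100%


def top_bottom_brightness(nw, ne, sw, se):
    """Return (top, bottom) brightness in percent.

    Top is the ne+nw average and bottom is the se+sw average,
    each rescaled so that 85.425 maps to 100%.
    """
    top = (ne + nw) / 2 / WHITE * 100
    bottom = (se + sw) / 2 / WHITE * 100
    return top, bottom
```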
Site and Policy » Database dump survey » Post 130
xbi
Also, I see that a pure white area is recorded as an intensity of ‘85.425’. There are thousands of images with pure white and 85.425 intensity in some quarter of the image, but literally zero images with intensities above 85.425. I can’t see why the code at https://gist.github.com/liamwhite/b023cdba4738e911293a8c610b98f987 is limited to this value.
Site and Policy » Database dump survey » Post 128
xbi
A small research report, nothing insightful, but maybe you’d like to hear that your data was used for some fun :)
The greatest number of reverse image search results is returned for a pure black image,
because:
- video intensities are often written as zeroes (not sure whether this is some default for non-images or a black first frame, but I am not curious here)
- there are a lot of images with an almost black RGB channel, shaped by the alpha channel
If I filter images to only jpg/png types, the most popular image intensities are 84% (for all of nw/ne/sw/se); these are mostly sketches on a white canvas. Random examples:
id:887130 OR id:1199605 OR id:710207 OR id:1973369 OR id:1342558 OR id:980837 OR id:1657132 OR id:1579694 OR id:1511092 OR id:845906 OR id:590728 OR id:2043172 OR id:879879 OR id:1311237 OR id:1289942 OR id:113926 OR id:833858 OR id:1747729 OR id:1907604 OR id:938497 OR id:1905000 OR id:1137379 OR id:1436988 OR id:156902 OR id:928870 OR id:1665760 OR id:1782928 OR id:1894505 OR id:1948949
You can use any of these sketches ( https://derpicdn.net/img/view/2016/7/13/1199605.png for example ) for a reverse search and get a large number of results even at 0.01 fuzziness.
Image intensity popularity was calculated by rounding the nw/ne/sw/se intensities (range [0.0 - 100.0]) to integers and printing the greatest count in a 100x100x100x100 array.
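The counting step can be sketched as below. Note that this uses a `collections.Counter` keyed by the rounded 4-tuple instead of the dense 100x100x100x100 array mentioned in the post, which gives the same answer with much less memory. The function name and row layout are assumptions, and Python's `round` (which rounds halves to the nearest even integer) may differ from whatever rounding the original script used.

```python
from collections import Counter


def most_common_intensities(rows):
    """Return ((nw, ne, sw, se), count) for the most frequent rounded tuple.

    rows: iterable of (nw, ne, sw, se) intensity values (hypothetical layout).
    Returns None when there are no rows.
    """
    counts = Counter(tuple(round(v) for v in row) for row in rows)
    if not counts:
        return None
    return counts.most_common(1)[0]
```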
Site and Policy » Database dump survey » Post 121
Damaged
Word Bug
@Background Pony #67B5
Like I said, it’s just one of the handful of hard-deleted images.
Your app needs to catch when application/json is not in the returned Content-Type header.
Personally, I use (Python):
assert 'application/json' in headers['content-type'].lower()
And then catch when an error is thrown.
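A slightly expanded sketch of the same check, for illustration only. The names `NotJSONError`, `require_json`, and `parse_or_skip` are hypothetical, and `headers` is assumed to be a plain mapping of header names to values. An explicit exception is used here instead of `assert`, because assertions are skipped when Python runs with `-O`.

```python
import json


class NotJSONError(Exception):
    """Raised when a response does not declare a JSON body."""


def require_json(headers):
    """Raise NotJSONError unless Content-Type mentions application/json."""
    content_type = ""
    for name, value in headers.items():
        if name.lower() == "content-type":  # header names are case-insensitive
            content_type = value.lower()
            break
    if "application/json" not in content_type:
        raise NotJSONError(f"expected JSON, got Content-Type {content_type!r}")


def parse_or_skip(headers, body):
    """Return the decoded JSON body, or None if the response is not JSON."""
    try:
        require_json(headers)
    except NotJSONError:
        return None
    return json.loads(body)
```

A caller can then treat `None` as "skip this image", which covers both hard-deleted images and responses served while the site is not returning JSON.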
Site and Policy » Database dump survey » Post 120
Background Pony #2ECF
@Damaged
Thanks for the suggestion.
@byte[]
Sorry, it should have been https://derpibooru.org/1206149.json; I forgot to change the number.
Site and Policy » Database dump survey » Post 119
Damaged
Word Bug
@Damaged
I had a problem with https://derpibooru.org/1206148.json; https://derpibooru.org/17.json does the same thing.
The first one, 1206148, works just fine for me. The second one is one of the hard-deleted images I mentioned. I’d suggest coding to allow for not getting JSON. This is particularly important when the site goes into DDOS protection mode, as you will NEVER get JSON when that is active.
Site and Policy » Database dump survey » Post 118
xbi
@Damaged
@byte[]
It would be more convenient to have some indication that an image is unavailable, so that when I fail to show a thumbnail, I know whether to start searching for a bug in my code or fixing my internet connection. It is also useful to exclude such images from statistics: they could be a hard-to-debug kind of outlier, and it is hard to make decisions about the image content with my own eyes.
Site and Policy » Database dump survey » Post 117
Background Pony #2ECF
@Damaged
I had a problem with https://derpibooru.org/1206149.json (EDIT: corrected from https://derpibooru.org/1206148.json); https://derpibooru.org/17.json does the same thing.
@byte[]
Both of the above return 302 and then go to https://derpibooru.org/
Site and Policy » Database dump survey » Post 116
xbi
@Background Pony #67B5
I see that only 565 rows are missing from the database in the range from 0 to the last id. Of course, many more images were deleted for legal reasons or at artists’ request. Also, derpibooru.org behaves very differently for images which are deleted from the database and for hidden images. For example, >>411160 (deleted) is hidden and shows the reason for deletion, but has a database entry. And >>17 or >>964900 are really deleted from the database, and the derpibooru site redirects to the main page. From my point of view as a user the behavior is the same: I can’t write simple code which skips the thumbnails of these deleted images during HTML generation without going online.
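Counting the gaps in the id range can be done offline with something like the sketch below. The function name is hypothetical, and the input is assumed to be the list of ids present in the dump. It only finds rows that are missing outright. Hidden images such as >>411160 still have a database entry, so they would not show up here.

```python
def missing_ids(present_ids, last_id=None):
    """Return the sorted ids in [0, last_id] that are absent from present_ids.

    If last_id is not given, the largest present id is used.
    """
    present = set(present_ids)
    if not present:
        return []
    if last_id is None:
        last_id = max(present)
    return [i for i in range(last_id + 1) if i not in present]
```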
Site and Policy » Database dump survey » Post 115
Damaged
Word Bug
@xbi
Funny you ask. There seem to be no rows in the database that contain data about deleted images (not even the data that were available through API).
Take for example:
select * from images as i where i.id between 16 and 18;
which returns only 2 rows, for 16 and 18, and I know for sure >>17 is deleted.
Or:
select * from images as i where i.id in (17, 238, 249, 280);
which returned no rows for me.
There are some hard-deleted entries. These are mostly old____+ ones.
@byte[]
You guys making changes to the API is fine and dandy. However: CAN YOU AT LEAST LET PEOPLE KNOW WHEN YOU ROLL OUT THE FUCKING CHANGES?! Make a stupid mailing list or something FFS.
I have noticed no such problems with the API. What endpoint are you having issues with?