Formerly, when we were missing an image for a standard emoji sequence,
we always labeled it 'missing'. However, Android excludes some
region flags, so we expect those images to be absent, and labeling them
'missing' wrongly implies an error. Add a new category, 'exclude', and
populate it using the unknown-flag alias keys.
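A minimal sketch of the categorization described above. The function name, the tuple-of-codepoints representation, and the sample data are assumptions made for illustration, not the tool's actual code.

```python
def categorize_unimaged(expected, available, aliases, unknown_flag):
    """Split expected sequences that have no image into 'missing' and 'exclude'."""
    # Keys aliased to the unknown-flag image are deliberately unsupported.
    excluded_keys = {src for src, dst in aliases.items() if dst == unknown_flag}
    report = {"missing": [], "exclude": []}
    for seq in expected:
        if seq in available:
            continue
        report["exclude" if seq in excluded_keys else "missing"].append(seq)
    return report


# Illustrative data only; sequences are tuples of codepoints.
UNKNOWN_FLAG = (0xFE82B,)
aliases = {(0x1F1E7, 0x1F1F1): UNKNOWN_FLAG}
expected = [(0x1F600,), (0x1F1E7, 0x1F1F1), (0x1F9DF,)]
available = {(0x1F600,)}
report = categorize_unimaged(expected, available, aliases, UNKNOWN_FLAG)
```

Here the flag sequence lands in 'exclude' because it is an alias key for the unknown flag, while the other unimaged sequence is still reported as 'missing'.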
Standard Unicode (emoji v5) does not define skin tones for wrestlers,
but Android does. Alias the skin tone variants of the
non-gender-specific emoji to the corresponding male emoji, as we do for
the non-skin-tone version.
When an image is not processed because the minimum quality setting
cannot be met, error 99 is returned. Catch this case too, so that we
still copy the file.
We might still want to tweak these settings.
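The fallback described above might look roughly like the following. This is a sketch only: run_or_copy is a hypothetical helper, and the demonstration uses a stand-in command that simply exits with status 99 rather than a real image optimizer.

```python
import os
import shutil
import subprocess
import sys
import tempfile

QUALITY_NOT_MET = 99  # the error code named in the text


def run_or_copy(cmd, src, dst):
    """Run cmd, which is expected to write dst from src.

    Returns True if the tool succeeded, False if it reported that the
    quality floor could not be met and src was copied to dst unchanged.
    """
    status = subprocess.run(cmd).returncode
    if status == 0:
        return True
    if status == QUALITY_NOT_MET:
        shutil.copyfile(src, dst)
        return False
    # Any other failure is still treated as a real error.
    raise subprocess.CalledProcessError(status, cmd)


# Demonstration with a stand-in command that always exits with 99.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "in.png")
dst = os.path.join(workdir, "out.png")
with open(src, "wb") as f:
    f.write(b"not really a png")
fake_tool = [sys.executable, "-c", "import sys; sys.exit(99)"]
optimized = run_or_copy(fake_tool, src, dst)
with open(dst, "rb") as f:
    copied = f.read()
```

Only status 99 triggers the copy; other nonzero statuses still raise, so genuine failures are not hidden.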
When ImageMagick 6.7.7-10 processes the '-extent' operator and
discovers that an image is grayscale, it converts the 32-bit truecolor
sRGB image into a grayscale image, but does so incorrectly: the gray
levels and alpha are wrong. Work around this by using composition
to copy the source image over a slightly larger transparent image.
Thumbnail generation for Unicode requires some changes:
- 72x72 images (exact, not just fit within that frame)
- custom prefix ending in underscore
- images for unsupported flags
The default build omits some flags, since they are not wanted by
Android. For the thumbnail list these images need to be provided, so
we alias them to the unknown flag image, as that's what would be shown
for them. It's probably a good idea to list these explicitly anyway,
since other tools could use this information.
To generate the Unicode comparison page of various vendor emoji,
Unicode prefers to use 72x72 images for all the supported emoji
without aliasing. This tool will generate these from the
directory of cleaned images produced by the emoji build, using
the aliases defined in emoji_aliases.txt.
Previously we didn't put the binary into the repo itself, since it's
built using the tooling. But people who fetch fonts from the get/noto
website want to know more about the version history of the fonts they
find there. Checking the binary into the noto-emoji site will facilitate
this.
This was built locally from the public images using the standard makefile
and zopflipng.
When relying on aliasing, a number of single-character emoji can be
replaced by sequence emoji (in particular, gendered variants). If
the images for those single characters aren't present, the current code
that displays a sequence 'visually' fails to find an image for one of
the parts, so it bails out and no visual presentation is shown for
those sequences.
To fix this, we first canonicalize the part we're looking for, and try
to find an image for that, and if we fail we check for an alias and
try to find an image for that.
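The lookup order above can be sketched as follows. The canonical() function here is only a stand-in (it just drops the variation selector); the real canonicalization, the function names, and the sample data are assumptions for illustration.

```python
EMOJI_VS = 0xFE0F


def canonical(seq):
    # Stand-in for the real canonicalization: drop the variation selector.
    return tuple(cp for cp in seq if cp != EMOJI_VS)


def find_image(part, images, aliases):
    """Look up an image for part, falling back to its alias target."""
    key = canonical(part)
    if key in images:
        return images[key]
    target = aliases.get(key)
    if target is not None:
        return images.get(canonical(target))
    return None


# Illustrative data: only the gendered sequence has an image.
images = {(0x1F46E, 0x200D, 0x2642): "emoji_u1f46e_200d_2642.png"}
aliases = {(0x1F46E,): (0x1F46E, 0x200D, 0x2642, 0xFE0F)}
found = find_image((0x1F46E, 0xFE0F), images, aliases)
not_found = find_image((0x1F600,), images, aliases)
```

The single-character part has no image of its own, so the lookup succeeds only through the alias to the gendered sequence.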
The aliases were not being canonicalized, so most of them were never
used, because the keys they are compared against are canonical. Fixed
that.
Also report unused aliases.
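A rough sketch of the two changes above: canonicalizing both sides of the alias map, and reporting aliases that were never matched. As before, canonical() is a simplified stand-in, and the helper names and data are hypothetical.

```python
EMOJI_VS = 0xFE0F


def canonical(seq):
    # Stand-in for the real canonicalization: drop the variation selector.
    return tuple(cp for cp in seq if cp != EMOJI_VS)


def canonicalize_aliases(raw_aliases):
    """Return the alias map with both keys and targets canonicalized."""
    return {canonical(src): canonical(dst) for src, dst in raw_aliases.items()}


def unused_aliases(aliases, looked_up):
    """Return alias keys that were never matched, for reporting."""
    return sorted(src for src in aliases if src not in looked_up)


raw = {
    (0x26F9, 0xFE0F): (0x26F9, 0xFE0F, 0x200D, 0x2642, 0xFE0F),
    (0x1F46E,): (0x1F46E, 0x200D, 0x2642, 0xFE0F),
}
aliases = canonicalize_aliases(raw)
looked_up = {(0x26F9,)}  # canonical keys that were actually requested
unused = unused_aliases(aliases, looked_up)
```

Without the canonicalization step, the first alias would be keyed with the variation selector and would never match a canonical lookup key.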
- Support --ignore_missing flag to skip missing data on output.
  When all_images is set, this skips sequences for which we have
  no image files. When all_images is not set, this skips sequences
  for which we have image files but which are not in the canonical
  sequence list (e.g. older sequences for which we included skin
  tone variants but which later versions of Unicode decided
  shouldn't have them).
- Use alias information to add alias sequences when not using
  all_images and we have an image for the target sequence.
- Use alias information to mark missing images with '-alias-' when
  we expect an alias (note, not only when we actually have one).
- Embed tool name, date, and arguments in a comment in the generated
  html.
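One reading of the --ignore_missing behavior in the first item above, as a sketch. It assumes that all_images means the sequence list comes from the canonical data rather than from the image file names; the function and its parameters are hypothetical.

```python
def select_sequences(image_seqs, canonical_seqs,
                     all_images=False, ignore_missing=False):
    """Choose which sequences to write to the page."""
    images = set(image_seqs)
    canon = set(canonical_seqs)
    base = canon if all_images else images
    if ignore_missing:
        # all_images set: drop canonical sequences that have no image.
        # all_images unset: drop imaged sequences not in the canonical list.
        base = base & (images if all_images else canon)
    return sorted(base)


# Illustrative data: one shared sequence, one image-only, one canonical-only.
image_seqs = [(0x1F600,), (0x1F93C, 0x1F3FB)]
canonical_seqs = [(0x1F600,), (0x1F9DF,)]
from_images = select_sequences(image_seqs, canonical_seqs)
from_images_strict = select_sequences(image_seqs, canonical_seqs,
                                      ignore_missing=True)
from_canon = select_sequences(image_seqs, canonical_seqs, all_images=True)
from_canon_strict = select_sequences(image_seqs, canonical_seqs,
                                     all_images=True, ignore_missing=True)
```

Under this reading, both modes reduce to the intersection of the two lists once --ignore_missing is given.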
We currently name the mixed-gender 'kiss' and 'couple with heart'
images after the single-codepoint sequences. But aliasing maps
the single codepoint sequence to the gendered sequence, not the
reverse. As a result the build doesn't create ligatures for the
gendered sequences, since it thinks the image doesn't exist.
Fix this by using the gendered sequence names for these images, and
let aliasing work as intended. This follows the convention we've
adopted of letting the name more completely describe the image
contents, and defining how to represent less-specific sequences
using aliasing rather than baking these decisions into each image
name.
We've been inconsistent about use of the variation selector in image names,
and it's cleaner if we just consistently drop it. We now take the full
Unicode strings for these names from the Unicode data, so we don't need
the selector in the image data.
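Dropping the selector from a name could be done along these lines. This is a sketch that assumes image names follow an 'emoji_u' + underscore-joined hex codepoints + '.png' pattern; the helper name is hypothetical.

```python
EMOJI_VS = "fe0f"


def strip_variation_selector(filename, prefix="emoji_u", ext=".png"):
    """Return filename with any fe0f component removed from the sequence."""
    if not (filename.startswith(prefix) and filename.endswith(ext)):
        raise ValueError("unexpected image name: %r" % filename)
    stem = filename[len(prefix):-len(ext)]
    parts = [p for p in stem.split("_") if p.lower() != EMOJI_VS]
    return prefix + "_".join(parts) + ext


renamed = strip_variation_selector("emoji_u1f3f3_fe0f_200d_1f308.png")
unchanged = strip_variation_selector("emoji_u1f600.png")
```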
Formerly the annotations file created a set of sequences that would
cause the name field to display with a special background color. This
change lets you choose one of three colors by defining the 'type' of
annotation in the file. The file format was extended, and the code that
uses it takes the annotation type into account.
This also adds a sample annotation file with annotations for a number of
situations we currently expect to encounter: missing images that we expect
to be supported by aliases to other images, flags that we expect to not
support, and new Unicode 10 emoji that we might not yet have image data
for.
By default, the list of emoji sequences is based on the union of
the sequences encoded in the image file names for all the directories
(or the first directory if --limit was set). The --all_emoji option
uses the emoji sequences from nototools/unicode_data instead.
By default, the list of emoji sequences is in unicode codepoint order.
The --emoji_sort option uses the emoji sequence sort order from
nototools/unicode_data instead.
Along with this, the ordered list of sequences becomes an argument to
write_html_page, which it should have been all along.
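The two orderings described above can be sketched like this. The function is hypothetical, and emoji_sort_order stands in for whatever ordered list nototools/unicode_data supplies.

```python
def order_sequences(seqs, emoji_sort_order=None):
    """Order sequences by codepoint, or by an explicit emoji sort order."""
    if emoji_sort_order is None:
        # Tuples of ints compare element-wise, i.e. in codepoint order.
        return sorted(seqs)
    rank = {seq: i for i, seq in enumerate(emoji_sort_order)}
    # Sequences absent from the custom order go last, in codepoint order.
    return sorted(seqs, key=lambda seq: (rank.get(seq, len(rank)), seq))


seqs = [(0x1F600,), (0x263A,), (0x1F44D, 0x1F3FB)]
by_codepoint = order_sequences(seqs)
by_emoji_order = order_sequences(
    seqs, emoji_sort_order=[(0x1F600,), (0x263A,)])
```

Computing the ordered list once and passing it into the page writer keeps the ordering decision out of the rendering code.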
It's a bit cleaner to canonicalize the keys when we read the file names.
This means we can just use the one canonical key, instead of using
the original to get the file and the canonical one to render text and
show the decoding.
This is a rewrite of add_glyphs in third_party/color_emoji. The
primary motivation was to move special aliasing rules out of that
code and use an external aliases file instead. This new version
is a bit more thorough about aliasing, and hopefully a little
easier to read.
The new add_glyphs takes its parameters using keywords, so
the invocation in the Makefile changed (as well as the path to
the tool).
emoji_aliases.txt was extended to add the flag aliases that were
formerly defined in the old add_glyphs code.
add_aliases was modified so the name of the alias file could be
passed in as a parameter to the main utility function that reads
the alias mapping from the file.
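A sketch of a reader that takes the alias file name as a parameter. The 'source;target # comment' line format and the function names are assumptions for illustration; the actual format of emoji_aliases.txt may differ.

```python
import os
import tempfile


def parse_seq(text):
    """Convert 'xxxx_yyyy' hex text into a tuple of codepoints."""
    return tuple(int(cp, 16) for cp in text.split("_"))


def read_aliases(filename):
    """Read 'source;target' lines into a dict, ignoring '#' comments."""
    aliases = {}
    with open(filename, "r", encoding="utf-8") as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if not line:
                continue
            src, dst = (s.strip() for s in line.split(";"))
            aliases[parse_seq(src)] = parse_seq(dst)
    return aliases


# Demonstration with a throwaway file; the entry is illustrative.
path = os.path.join(tempfile.mkdtemp(), "aliases.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("# alias table\n1f1e7_1f1f1;fe82b # BL\n")
aliases = read_aliases(path)
```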
The new code expects all glyphs used by the template GSUB tables
to be named in the GlyphOrder table, but doesn't require the cmap
and hmtx tables to be fleshed out. The new code fleshes these out
when it processes the sequences to add. As a result, the cmap and
hmtx tables in the template were truncated.
The new code also sorts the GlyphOrder table when it extends/rebuilds
it.
Since subregion flag sequences consist of characters with bidi classes
BN and ON, they can be affected by bidi reordering. Once again we have
the problem that these are processed in visual order, so we need GSUB
rules that handle them in either direction. All subregion flag
sequences contain U+E007F, so we use that as a trigger for adding the
reversed sequence.
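The trigger described above can be sketched as follows. The function is hypothetical; the real build emits GSUB rules rather than a plain list.

```python
CANCEL_TAG = 0xE007F


def with_reversed(seqs):
    """Return seqs, adding a reversed copy of each one containing U+E007F."""
    result = []
    for seq in seqs:
        result.append(seq)
        if CANCEL_TAG in seq:
            result.append(tuple(reversed(seq)))
    return result


# Black flag + tag characters 'gbeng' + cancel tag.
england = (0x1F3F4, 0xE0067, 0xE0062, 0xE0065, 0xE006E, 0xE0067, 0xE007F)
expanded = with_reversed([(0x1F600,), england])
```

Sequences without the cancel tag pass through unchanged, so ordinary emoji do not gain a reversed variant.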
We also need to handle emitting the missing flag glyph for the
reversed sequences.
And we also want to strip out tag glyphs when the context is reversed.
This means the chaining context should include 'E007F' as well.