This adds an argument to generate_emoji_html that specifies
a file listing codepoint sequences. Emoji matching one of these
codepoint sequences are highlighted in the output.
When waveflag.c was forked from behdad's repo one of the changes that
was made effectively removed the ability to generate different sizes
of flags, despite leaving the SIZE value in the source. Recently we
needed to generate waved flags at a different power-of-two size, and
found it no longer functioned as the original. These changes restore
that while leaving the other changes in this forked version (mostly
formatting changes) intact.
It used to be difficult to find a sequence since the codepoints weren't
provided, just the images. This provides the codepoint list as
the 'name' of the sequence.
This also makes some other changes:
- the python template system doesn't like keyword names that have
hyphens, so rename font-face-style to fontFaceStyle to get around this.
Thought this had been fixed earlier, but apparently it didn't end up in
a pushed commit.
- no longer insert emoji variation selector after some characters.
This was done to see what difference it made in browser behavior, but
we think now that browsers should be able to handle these sequences
without the selectors present.
- use a flag to pass the name of the output html file, rather than
taking it as a direct arg. Other flags take multiple args, and if the
html file name comes after one of those it gets swallowed by that flag,
so it had to come first. Now the file name can go anywhere in the
parameter list, since its flag acts as a delimiter.
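A minimal sketch of why the flag form is order-independent, assuming
argparse is the parser in use. The flag names -o/--outfile and
--image_dirs are hypothetical stand-ins, not necessarily the names the
tool uses:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical multi-value flag; nargs='+' greedily consumes every
    # following argument up to the next flag.
    parser.add_argument('--image_dirs', nargs='+', default=[])
    # Because the output file has its own flag, the flag itself ends
    # the preceding flag's argument list.
    parser.add_argument('-o', '--outfile', required=True)
    return parser


def parse(argv):
    args = build_parser().parse_args(argv)
    return args.image_dirs, args.outfile


# The output file can come first or last; both parse the same way.
first = parse(['-o', 'out.html', '--image_dirs', 'a', 'b'])
last = parse(['--image_dirs', 'a', 'b', '-o', 'out.html'])
```

With a bare positional instead, a trailing file name would be taken as
one more value for the multi-value flag.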
Images are shown in LTR and RTL contexts. Chrome currently doesn't
correctly render some emoji sequences, in part because it is using
Unicode 8 property data. At any rate, these are known Chrome issues.
To handle forming emoji 'ligatures' in RTL contexts we generate
reversed ligature sequences for the GSUB table. Formerly we only did
this when there was a ZWJ in the sequence, and full reversal worked
because we had no sequences with both Fitzpatrick modifiers and ZWJ.
However, now we do. HarfBuzz treats Fitzpatrick modifiers as though
they were combining marks, so we need to as well in order for the GSUB
data to be in the order HarfBuzz expects. So after reversing we
'unreverse' each base + modifier pair.
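As a rough illustration of the reordering (a sketch only, not the code
in this change; rtl_sequence is a hypothetical helper name), the idea
is to reverse the whole sequence and then swap each modifier back so
that it again follows its base:

```python
# Emoji modifiers (Fitzpatrick skin tones) are U+1F3FB..U+1F3FF.
FITZPATRICK = range(0x1F3FB, 0x1F400)


def rtl_sequence(seq):
    """Reverse seq, keeping each Fitzpatrick modifier after its base."""
    rev = list(reversed(seq))
    i = 0
    while i < len(rev) - 1:
        if rev[i] in FITZPATRICK:
            # After full reversal the modifier precedes its base;
            # swap the two so the pair is back in logical order.
            rev[i], rev[i + 1] = rev[i + 1], rev[i]
            i += 2
        else:
            i += 1
    return tuple(rev)


# man, light skin tone, ZWJ, rocket
ltr = (0x1F468, 0x1F3FB, 0x200D, 0x1F680)
rtl = rtl_sequence(ltr)
```

Here rtl comes out as rocket, ZWJ, man, light skin tone: the sequence
is reversed but the modifier still trails the man.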
By default the tool uses all sequences that appear in any of the image
sets. To make it easier to see just the changes between a smaller
set of images and a larger one, this lets you limit the sequences
to just those in the first set being compared.
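The selection amounts to something like the following sketch. It
assumes each image set is a dict keyed by codepoint sequence;
sequences_to_compare and limit_to_first are hypothetical names, not
the tool's actual function or option:

```python
def sequences_to_compare(image_sets, limit_to_first=False):
    """Return the sequences to show when comparing image sets.

    image_sets is a list of dicts mapping a codepoint sequence (tuple)
    to an image path.
    """
    if limit_to_first:
        # Only sequences present in the first set being compared.
        return set(image_sets[0])
    # Default: every sequence that appears in any set.
    return set().union(*image_sets)


small = {(0x1F600,): 'small/emoji_u1f600.png'}
large = {(0x1F600,): 'large/emoji_u1f600.png',
         (0x1F601,): 'large/emoji_u1f601.png'}
everything = sequences_to_compare([small, large])
limited = sequences_to_compare([small, large], limit_to_first=True)
```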
Formerly, we wrote the file paths as given on the command line, on the
assumption that the output file was in the cwd and so the paths to the
directories would be correct.
However, if we want to generate the output file somewhere other than
the cwd, the generated image paths don't work. This takes the location
of the output file into account and either generates relative paths
if the files are under the output file directory, or absolute paths
otherwise.
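A minimal sketch of that rule, assuming POSIX-style paths;
path_for_output is a hypothetical name, and the actual code may decide
this differently:

```python
import os.path


def path_for_output(target, outfile):
    """Return the path to write into outfile for referring to target."""
    out_dir = os.path.dirname(os.path.abspath(outfile))
    target = os.path.abspath(target)
    # commonpath compares whole path components, so /tmp/out2 is not
    # treated as being under /tmp/out.
    if os.path.commonpath([out_dir, target]) == out_dir:
        return os.path.relpath(target, out_dir)
    return target


inside = path_for_output('/tmp/out/images/a.png', '/tmp/out/index.html')
outside = path_for_output('/var/img/a.png', '/tmp/out/index.html')
```

inside is relative to the output file's directory; outside stays
absolute.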
- materialize_emoji_images is a tool that adds symlinks to an existing set
of images, creating aliases with names that match some of those that
are built into the ligature table. This is for the convenience of folks
who want to review the images and see what sequences/codepoints we support.
I've been asked to do this enough that I might as well just build a tool
for it.
- flag_info picks out the flag images in a directory based on two kinds of
naming styles the data use (ASCII or emoji_u+codepoint) and presents them
in a list similar to that in the Makefile. It helps when tracking down
what flags we support and what we don't by making it easier to compare
sets of flag images with different naming. This is another quick one-off.
- Reformat lines to 80 columns.
- Use logging instead of verbose/quiet and other options.
- A few miscellaneous small fixes/tweaks to parameters. Removed some
file-path-relative stuff that assumed old directory structure.
This uses some new functions in nototools.tool_utils, see nototools#220.
- Remove PUA character for 'unknown flag' from cmap.
Unfortunately, the contorted build process means we can't do this where
we do our other cmap munging-- font.getGlyphID dies in emoji_builder
if we remove it from the cmap in add_glyphs.py. So we remove it at
the end of emoji_builder.
- Forgot to remap one territory flag, missed it in the spreadsheet. Also
corrected a typo where I remapped the same flag twice. Sorted the flags
in key alpha order.
This adds some additional flags to the default set. In addition,
it contains code that creates ligatures mapping some flag sequences
to the glyphs of others, for a few cases where we want different
regions to share the same flag. Finally, it adds default ligatures so
that pairs of regional indicator characters for which there's no
predefined glyph get a 'missing flag' glyph. This avoids cases where
runs of regional indicator characters accidentally match at odd
locations because of a previous mismatch.
There is no actual 'missing flag' glyph yet. The code uses an
existing emoji as a placeholder.
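The default ligatures can be pictured roughly as below. This is a
sketch under assumptions: default_flag_ligatures and the glyph name
'missing_flag' are hypothetical, and the real code builds GSUB data
rather than a dict:

```python
from itertools import product

# Regional indicator symbols A..Z are U+1F1E6..U+1F1FF.
REGIONAL_INDICATORS = range(0x1F1E6, 0x1F200)


def default_flag_ligatures(known_pairs, missing_glyph='missing_flag'):
    """Map every regional indicator pair without a flag to one glyph."""
    return {
        pair: missing_glyph
        for pair in product(REGIONAL_INDICATORS, repeat=2)
        if pair not in known_pairs
    }


# Suppose only the 'US' pair has a real flag glyph.
known = {(0x1F1FA, 0x1F1F8)}
defaults = default_flag_ligatures(known)
```

Since every pair now forms some ligature, a run of regional indicators
is always consumed two at a time, so an unsupported pair can't shift
the pairing of the characters that follow it.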
This uses nototools to get Unicode names. It relies on a new
API in nototools.unicode_data to get data/names of proposed emoji
that are not currently approved and so not in the standard data
files.