Wanted: Chinese Name Gender Reference

[Update: Found it! See this.]

I sometimes get emails from Chinese people who are friends of friends, or someone I’ve never actually met, and I’m not really sure if it’s a boy or girl. There was no way to know ahead of time because it was always “tā” when my friend mentioned “tā,” and I never thought to ask. So then I get an email from someone like 张安平 or 李娟 and I don’t know the gender of the sender (it’s a mind-bender!). I don’t want to be an offender, so this is something I’d like a computer to render (ok, I’ll stop).

I know it’s not the MOST important tool in the world, but still, here’s what I want: I enter a Chinese given name (míngzi 名字) and it tells me whether it’s most likely a boy’s name or a girl’s name.

I know there are some that are ambiguous, but I get the feeling that Chinese people can tell from reading someone’s hanzi name if that person is most likely a boy or a girl. For example the inclusion of “flower” (huā 花) is a dead giveaway that it’s a girl, and “dragon” (lóng 龙) is reserved only for boys, right? Well, where’s the resource that lists all those sorts of rules of thumb?

Questions for you all (if anyone’s still reading):

1. Does something like this exist? If so, where? If not, don’t you think that’s unfair since English learners can easily find out the gender of most English names?

2. If this doesn’t exist, would it be possible to create (given the technical savvy)? What would the problems be?

Comments

  1. Interesting question. I’d guess that is a woman’s name based on the radical, but that’s purely an uninformed guess.

    In Japanese name genders are pretty obvious, with the first clue being that if it ends with then it’s a girl. Names ending with , , , etc. are boys. Of course there are a lot more patterns out there, but I can only think of one gender-neutral name off the top of my head: Mizuki (for which there are many possible kanji combinations).

  2. I think this would be great, and it could have other applications as well, such as helping laowai who want to choose a suitable Chinese name, but don’t want to accidentally choose one of an inappropriate gender. Also, it could help us choose a name that is common and culturally acceptable. I still haven’t picked one out, partially because I’m afraid of ending up like so many of my students with weird English names, like King and Stone and Baby and Yoyo and Dragon.

  3. If you give me a list of names + genders (Male, Female, Either), I’ll look at making this.

    Problems:

    Names are often multi-character. This means that there will be a rather large set of names to be added to the database. Some characters could be ignored – for example.

    Each character could be given a “gender” weighting, and the result could be the sum of the gender weights. More negative = more likely that it is female, more positive = more likely that it is male.

    Some characters could be triggers for gender, like and .

    Need:

    List of all characters used in names.
    Values for each character.

  4. Is there a possibility of exploiting Google for this? If you search for “张安平他” you get 123,000 hits, while with “张安平她” it’s zero. For “李娟她” it’s 1,330, while with “李娟他” it’s 708. We can deduce that 张安平 is male and 李娟 is female. I think that’s right, right?

    I’m banking on most of the combinations of name + pronoun to be something like “as for person X, he/she…”, which is obviously not 100% true. So it’s not a foolproof scheme. But I suspect it would be a good first approximation, which you could supplement with component analysis and other cleverness as amake has suggested.

  5. Great idea Corpus!

    I’m still going to try to make the db thingy, but to be honest I think your idea is sufficient for most people.

    My wife tells me that my idea of giving “gender weightings” to each character won’t work – sometimes parents give their children one character names because they sound pretty when combined with the surname.

    I found a massive list of Chinese names (500,000+), awaiting permission from the creator to use them.

  6. Cojak.org has pages for Male Given Names, Female Given Names, and Ambigender Given Names. You could look up the name on each page and see from the bar underneath each character the likelihood the name is male or female. For example, you’ll find that appears only on the “Female Given Names” page and it has a large bar underneath.
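
For anyone who wants to tinker, here is a rough Python sketch of the two ideas from the comments: the summed per-character weights from comment 3 and the 他/她 hit-count comparison from comment 4. It is only an illustration under some loud assumptions. The weights in `CHAR_WEIGHTS` are invented for the example and not derived from any real name list, the function names and the 0.2 threshold are hypothetical, and the hit counts are simply the figures quoted in comment 4 typed in by hand (no search engine is queried).

```python
# Hypothetical per-character weights: negative leans female, positive leans male.
# These values are made up for illustration; real ones would need a name list.
CHAR_WEIGHTS = {"娟": -0.9, "花": -0.8, "龙": 0.8, "平": 0.3, "安": 0.0}


def weight_score(given_name, weights=CHAR_WEIGHTS):
    """Sum the weights of the characters in a given name (surname removed).

    Characters missing from the table count as 0, i.e. no evidence either way.
    """
    return sum(weights.get(ch, 0.0) for ch in given_name)


def pronoun_score(he_hits, she_hits):
    """Turn hit counts for name+他 and name+她 into a score between -1 and 1.

    +1 means every hit used 他, -1 means every hit used 她, 0 means no signal.
    """
    total = he_hits + she_hits
    if total == 0:
        return 0.0
    return (he_hits - she_hits) / total


def guess_gender(score, threshold=0.2):
    """Map a score to a label; the threshold is an arbitrary assumption."""
    if score > threshold:
        return "male"
    if score < -threshold:
        return "female"
    return "either"


if __name__ == "__main__":
    # Hit counts as quoted in comment 4.
    print("张安平", guess_gender(pronoun_score(123000, 0)))
    print("李娟", guess_gender(pronoun_score(708, 1330)))
    # Character weights applied to the given names only.
    print("安平", guess_gender(weight_score("安平")))
    print("娟", guess_gender(weight_score("娟")))
```

Both scores use the same sign convention, so they could be averaged or otherwise combined. As comment 5 points out, though, per-character weights can mislead when a character was chosen for its sound, so the hit-count signal is probably the safer first guess.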
