submitted 8 months ago* (last edited 8 months ago) by someguy3@lemmy.world to c/nostupidquestions@lemmy.world

Do they just speak faster? Do the Indian words/pronunciation flow better/faster than English does? And they are simply trying to match the cadence?

[-] merc@sh.itjust.works 311 points 8 months ago* (last edited 8 months ago)

One way of classifying languages is grouping them into stress-timed, syllable-timed and "mora"-timed languages.

Stress-timed languages (like English) are ones where the time between stressed syllables is roughly the same. Take the phrase "I went to the store with my friend John". Most native English speakers will stress "went", "store", "friend" and "John". It might not be a big difference, but you'll notice the "to the" between "went" and "store" is rushed, and that there's a sort of gap between "friend" and "John" since both are stressed. (Also, if you were to modify that slightly and say "I went to the store with my friend named John", the time between "friend" and "John" wouldn't change much at all. You'd just slip "named" into that gap.)

Many Romance languages are seen as syllable-timed, where each syllable takes the same amount of time. In French that phrase is "Je suis allé au magasin avec mon ami John". That's 14 syllables, all with roughly the same timing. In Spanish it's "Fui a la tienda con mi amigo John", 11 syllables. Unless you're really drawing attention to one of the words, every syllable there gets roughly the same timing.

Japanese is mora-timed, which is pretty similar to being syllable-timed, except that a doubled letter doubles the length of its syllable. So, "Just a moment please" is "Chotto matte kudasai", where the syllables with a double "t" take twice as long. The cities Tōkyō (two syllables), Ōsaka (three syllables) and Kawasaki (four syllables) all take the same amount of time to say, because the macron in "ō" means that vowel gets double the length of a standard "o".
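To make the mora idea concrete, here is a rough sketch that counts morae in romanized Japanese. It assumes Hepburn-style romanization with macrons for long vowels, and the function name `count_morae` is just made up for illustration. It only handles the cases mentioned above (long vowels, doubled consonants) plus the syllabic "n", so treat it as a toy rather than a real phonological analysis.

```python
import unicodedata

# Long vowels written with a macron count as two morae, so expand them first.
LONG_VOWELS = {"ā": "aa", "ī": "ii", "ū": "uu", "ē": "ee", "ō": "oo"}
VOWELS = set("aeiou")


def count_morae(word: str) -> int:
    """Count morae in a Hepburn-romanized Japanese word (simplified sketch)."""
    # Normalize so that "o" + combining macron becomes the single character "ō".
    text = unicodedata.normalize("NFC", word).lower()
    for long_vowel, expanded in LONG_VOWELS.items():
        text = text.replace(long_vowel, expanded)

    morae = 0
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in VOWELS:
            # Every vowel, including the second half of a long vowel, is one mora.
            morae += 1
        elif ch.isalpha():
            if ch == nxt:
                # First half of a doubled consonant (as in "chotto") is its own mora.
                morae += 1
            elif ch == "n" and nxt not in VOWELS and nxt != "y":
                # Syllabic "n" (word-final or before a consonant) is its own mora.
                morae += 1
    return morae


for city in ("Tōkyō", "Ōsaka", "Kawasaki"):
    print(city, count_morae(city))  # each city comes out at 4 morae
```

With this counting, "chotto" and "matte" each come out at 3 morae even though each has only two syllables, which is the extra length the doubled "t" adds.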

The 4 most widely spoken languages in India are Hindi (way out in front, with 44% of the population speaking it as a first language), followed by Bengali, Marathi and Telugu (with about 6-8% each). The first 3 are all Indo-Aryan languages, and Telugu is a Dravidian language. The 3 Indo-Aryan languages are considered to be syllable-timed, and Telugu is considered to be mora-timed.

IMO, what makes Indian-inflected English seem fast is that speakers carry the syllable / mora timing of their primary language over into English. That means they spend less time on the syllables / words that English speakers would stress and more time on the unstressed ones. The overall duration of what they say is probably similar, but in evening out the length of the syllables, they take time away from the syllables that other English speakers naturally slow down to stress. Since listeners tend to notice the stressed words most, and those are the ones that sound rushed, the entire sentence seems rushed.

[-] threeduck@aussie.zone 25 points 8 months ago* (last edited 8 months ago)
[-] Dragster39@feddit.de 23 points 8 months ago

Thank you, that was a good and interesting start of the day

[-] Black_Gulaman@lemmy.dbzer0.com 19 points 8 months ago

This is a fantastic explanation. Thank you.

[-] tigeruppercut@lemmy.zip 7 points 8 months ago

I remember seeing a linguist doing research into the actual timing of long Japanese vowels and finding that they weren't actually double the length, more like 1.5 times as long (or 1.7 or something like that). I'll have to see if I can find the article or paper again.

[-] merc@sh.itjust.works 5 points 8 months ago

Yeah, that makes sense. It seems hard to lengthen a vowel out like that unless you're actually chanting or something and are keeping to a specific rhythm.

[-] ilinamorato@lemmy.world 4 points 8 months ago

Ok, so I heard a thing a long time ago about information density in languages, and that there's a specific amount of information conveyed per second which is pretty consistent across languages, even when the number of sounds is higher or lower. Which means that a single word in English, for instance, would convey more information than a single word in Hindi.

Is there anything to that? Or was that just nonsense?

[-] merc@sh.itjust.works 11 points 8 months ago

Someone posted a link to just that topic here. Apparently almost all languages transmit about 39 bits per second of data. Italians use 9 syllables per second, Germans only about 5-6, but both convey the same amount of information per second. But, not all syllables are equal. Japanese has about 5 bits per syllable, English has about 7 bits per syllable. The most information dense language per syllable is apparently Vietnamese with about 8 bits per syllable.

Apparently though, the bottleneck is the brain. The end result seems to be that languages that have fewer "bits of data" per syllable say those syllables more quickly, and the ones with more bits of data per syllable say those syllables more slowly, so that the average is about 39 bits per second no matter what the language.
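As a back-of-the-envelope check, the trade-off boils down to: bits per second = syllables per second × bits per syllable. The sketch below plugs in only the rough figures quoted in this thread (39 bits per second; about 5, 7 and 8 bits per syllable) and works out the speaking rate each language would need to hit that target. The function name is made up for illustration, and the results are implied by the arithmetic, not measured values.

```python
# Rough figures quoted in this thread, not measurements of my own.
TARGET_BITS_PER_SECOND = 39
BITS_PER_SYLLABLE = {"Japanese": 5, "English": 7, "Vietnamese": 8}


def implied_syllable_rate(bits_per_syllable: float,
                          bits_per_second: float = TARGET_BITS_PER_SECOND) -> float:
    """Syllables per second needed to reach the target information rate."""
    return bits_per_second / bits_per_syllable


for language, bits in BITS_PER_SYLLABLE.items():
    rate = implied_syllable_rate(bits)
    print(f"{language}: about {rate:.1f} syllables per second")
```

Going the other way, the 9 syllables per second quoted for Italian would imply roughly 39 / 9 ≈ 4.3 bits per syllable.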

Having said that, I often listen to podcasts sped up to 1.5x speed, and I listen to podcasts while doing other things, so I guess the bottleneck is probably on the sending side rather than the receiving side.

[-] takeheart@lemmy.world 3 points 8 months ago

Podcasts, being prerecorded and edited, don't really fit this model. It's more for a conversation with a back and forth, where neither interlocutor knows ahead of time what the other person will say. So they need to observe/listen and reflect while also coming up with answers and putting effort into being properly understood. So basically the natural context in which inter-human communication evolved.

[-] ytg@feddit.ch 1 points 8 months ago

Does anyone know how the amount of information is actually derived? The article just says “researchers calculated”

[-] merc@sh.itjust.works 1 points 8 months ago

They were vague about it, but they said something about converting it to computer code. I would guess they just wrote it out as ASCII text and counted how many bits of ASCII equivalent they transmitted. (Of course this ignores intonation and emphasis, but I'd guess they did ignore those.)

[-] bleistift2@feddit.de 1 points 7 months ago

If that’s really what they did, it’s stupid. First, you need to find a translation for every language to ASCII, which will wildly skew the results. Second, there are many ways to express the same concept, which all vary wildly in length. Take “Hi”, 2 letters, which means exactly the same as “How are you doing?”, 14 letters.

[-] merc@sh.itjust.works 1 points 7 months ago

Take “Hi”, 2 letters, which means exactly the same as “How are you doing?”, 14 letters.

It's similar, but not exactly the same by any stretch. But, yeah, it's not a perfect method. But, there probably isn't a perfect method. How would you decide what "1 unit of information" is?

[-] bleistift2@feddit.de 1 points 7 months ago

How would you decide what “1 unit of information” is?

I wouldn’t, because I have no knowledge in the field. But since the paper hinges upon that exact definition, and “They were vague about it”, this raises the biggest red flag I’ve seen in science yet.

[-] actual_patience@programming.dev 3 points 8 months ago

Ok, so I heard a thing a long time ago about information density in languages, and that there's a specific amount of information conveyed per second which is pretty consistent across languages, even when the number of sounds is higher or lower.

This is true.

Which means that a single word in English, for instance, would convey more information than a single word in Hindi.

I don't think that's the right interpretation. There are words in English that would each take a whole sentence to convey in another language. But the same is true vice versa.

Have a look at subtitles for movies translated from one language to any other. Translators struggle to convey the paragraph-long context behind a single word in the source language. Do not get me started on doublespeak.

[-] ilinamorato@lemmy.world 2 points 8 months ago

Oh, interesting. I hadn't considered that there would be variances in information density within a language, but that makes sense; "truth" is a very loaded concept that means a lot of different things in context, even though it's only one syllable; but on the other hand "authenticity" is five syllables but carries with it a meaning that is a subset of the definition of "truth."

I guess that's why subtitling is even possible in different languages; if there were languages with vastly less information density than the source language, they'd need a whole screen just for the captions.

this post was submitted on 08 Mar 2024
216 points (100.0% liked)
