Items tagged with: linguistics
Hot off the press in the International Journal of Learner Corpus Research: "The more proficient the learners, the less sophisticated their L2 vocabulary? The curious effect of the reference corpus on mean-frequency measures of lexical sophistication" co-authored with @RaffaellaBottini! 🤗
#CorpusLinguistics #LearnerLanguage #linguistics #OpenAccess
Grandma used the word “whatsome” a lot. I've never heard anyone else say it. I often wonder where it came from.
Curiously, the Oxford Dictionary defines it as an obsolete #MiddleEnglish word meaning “whatever” that hasn't been used in over 500 years.
“Whatsome” was Grandma's “whatchamacallit”. She could also say “and whatsome” in the sense of “and so on”.
Incidentally, Oxford recognises “whatsomever” as a surviving #dialect word.
If you're a #language nerd like I am, then you won't have missed the @mozilla #CommonVoice v19 #speech #dataset release - which now features 131 languages! Here's my #dataviz, done in @observablehq, of the v19 #metadata coverage.
I've updated the visualisation this time around with human-readable language names instead of their ISO-639 or BCP-47 language codes to make it easier to read.
There are some interesting observations:
▶ Catalan (ca) continues to be the leader in terms of data - speaking volumes about the efforts to revitalise culture and language in Catalunya. It's also one of the few languages that has data for all age groups, particularly older speakers - this sort of data is missing for most other languages.
▶ Kiswahili (sw) is one of the languages where there is more data for female-identifying speakers than for male-identifying speakers ♀ - although Japanese (ja), Western Mari (mrj) and Luganda (lg) do pretty well here, too!
▶ Sentence domains can now be categorised, and although most new sentences are "general", Albanian (sq) has a lot of sentences related to law and government.
▶ Tsonga (ts), a Bantu language spoken in Southern Africa, has dethroned Icelandic (is) as the language with the highest average utterance duration. I don't know enough about Tsonga to speculate why - it's a somewhat agglutinative language, but many Tsonga words are generally short.
▶ Bengali / Bangla (bn) has a significant amount of data that is not yet validated, and therefore does not appear in training / dev / test splits. There is a similar case for many languages new to Common Voice - it takes time to validate.
▶ The language with the highest average number of contributions per speaker is Taita (dav), a Bantu language from Kenya.
What do you make of the data visualisation? Are there any other insights you can see?
Big thanks to the CV team for all their efforts - EM, Jessica Rose, Dmitrij Feller and Justin Grant.
Can #typos make lies seem less deceptive or make true statements seem less true?
“true statements with grammatical errors and unusual word choices were seen as more deceitful, and lie statements with the same language were seen as less deceptive” and “I discovered a new brain response that is sensitive to the difference between perceived truths and lies.”
Dissertation permalink: digitalcommons.usu.edu/etd2023…
DOI: doi.org/10.26076/068f-415c
#ethics #xPhi #decisionScience #neuroscience #psychology #business #communication #linguistics
Online Deception: The Impact of Language in Text-Based Deception Detection
In today’s digital age, spreading false information online can have serious consequences, from affecting elections to undermining public health efforts.
DigitalCommons@USU
Mistress, Miss, Mrs or Ms: untangling the shifting history of titles
In a paper published in the autumn 2014 issue of History Workshop Journal Dr Amy Erickson unravels the fascinating history of the titles used to address women.
University of Cambridge
edit: still trying! please keep on boosting!
we're moving to #Prague! and we need some friends.
we're moving to #Czechia in September and we might be a bit lonely. so if you want to befriend a couple of quirky migrants, let me know!
I'm into #music, #FOSS, #linguistics, #DoctorWho, #TTRPG and #DropoutTV. oh, and I'm #trans.
my girlfriend likes #medicine, British television, #Disney, #Eurovision and #TrueCrime. "I am the bisexual stereotype".
#Praha #Prag #PleaseBoost
Languages: English, Finnish, French, German, Swedish.
Fully open access, no article processing fees. All areas of (general) #linguistics welcome.
journal.fi/finjol/about
I am super excited about this mini-conference on #reproducibility in #linguistics that I am organising this evening: Four of my M.A. students will be reporting on their attempts to reproduce the results of four published quantitative linguistics papers for which the data is available, but not the code!
Colleagues, they have *a lot* of things to report! So, if you're in the area (Cologne), do come along! There will be #ReproducibiliTea and Christmas biscuits! 🍵 🍪 #OpenScience
Tomorrow at noon EST, I will give a #talk on the #accessibility of #language #learning and #linguistics in general for #screenReader users as part of the a11yTalks event. This will be a public event with no need to register so if this is something any of you are interested in, here's the link :) a11ytalks.com/posts/2023-MAY/ #speaker #a11y
Do you speak Accessibility? - A look at accessibility hurdles for language learning and linguistics - A11yTalks
One would think that language has been solved in 2023. We have translation apps, sign languages, an international phonetic alphabet that is supposed to be able to represent any sound in any language for academics to endlessly discuss over.
Would you like to take part in a research project on depicting signs in encounters with other people?
Norwegian Sign Language is a language in its own right. When you meet a sign language user in international settings, you use International Sign and body language to understand each other.
Norges Døveforbund
Fascinating article!
"Unearthing a Long Ignored African Writing System, One Researcher Finds African History, by Africans: BU anthropologist Fallou Ngom discovered Ajami, a modified Arabic script, in a box of his late father’s old papers" posted December 21, 2022, written by Molly Callahan
bu.edu/articles/2022/fallou-ng…
Unearthing a Long Ignored African Writing System, One Researcher Finds African History, by Africans
A note in Ajami, a modified Arabic script, from Fallou Ngom's late father opened the door to a lifetime of discovery in African language and history.
Molly Callahan (The Brink)
I’m writing an article that expands my microblog entry on stylometric fingerprinting to give more comprehensive advice. I am partially walking back on my recommendation not to use machine translation and adding information about reading levels, among other things. Would anybody familiar with #stylometry, or with a #linguistics background experienced with close-reading, be up for reviewing a rough draft next week?
I’d also be interested in how people may describe my own stylometric fingerprint (signature phrases, grammar quirks, etc), to use as an example.
Boosts appreciated.
Stylometric fingerprinting resistance
Following the recent SCOTUS ruling, many have been trying to publish resources to help people find reproductive healthcare. They often wish to do this anonymously, to avoid doxxing. There’s no shortage of guides on how to stay anonymous online.
Seirdy's Home
🔵a #book lover
🔵mildly #burnedout but trying to take it easy 😊
🔵a #cat lover (sharing my place with 1 fluffy terrorist)
🔵#choleric (actively working on my tantrums though) 😂
🔵a good #communicator
🔵#compassionate
🔵#curious
🔵#Czech
🔵an #effectivealtruist (I translated Toby Ord's The Precipice into Czech)
🔵#empathic
🔵a #feminist
🔵#friendly
🔵a #linguistics nerd
🔵a #liberal
🔵a language #teacher
🔵a literary #translator
🔵very much in love with @stepan
🔵a #woman