Items tagged with: Linguistics


Hot off the press in the International Journal of Learner Corpus Research: "The more proficient the learners, the less sophisticated their L2 vocabulary? The curious effect of the reference corpus on mean-frequency measures of lexical sophistication" co-authored with @RaffaellaBottini! 🤗

#CorpusLinguistics #LearnerLanguage #linguistics #OpenAccess

jbe-platform.com/content/journ…


Grandma used the word “whatsome” a lot. I've never heard anyone else say it. I often wonder where it came from.

Curiously, the Oxford Dictionary defines it as an obsolete #MiddleEnglish word meaning “whatever” that hasn't been used in over 500 years.

“Whatsome” was Grandma's “whatchamacallit”. She could also say “and whatsome” in the sense of “and so on”.

Incidentally, Oxford recognises “whatsomever” as a surviving #dialect word.

#Linguistics #HistoricalLinguistics #English #Etymology


If you're a #language nerd like I am, then you won't have missed the @mozilla #CommonVoice v19 #speech #dataset release - which now features 131 languages! Here's my #dataviz, done in @observablehq of the v19 #metadata coverage.

I've updated the visualisation this time around with human-readable language names instead of their ISO-639 or BCP-47 language codes to make it easier to read.

There are some interesting observations:

▶ Catalan (ca) continues to be the leader in terms of data - speaking volumes about the efforts to revitalise culture and language in Catalunya. It's also one of the few languages that has data for all age groups, particularly older speakers - this sort of data is missing for most other languages.

▶ Kiswahili (sw) is one of the languages where there is more data for female-identifying speakers than for male-identifying speakers ♀ - although Japanese (ja), Western Mari (mrj) and Luganda (lg) do pretty well here, too!

▶ Sentence domains can now be categorised, and although most new sentences are "general", Albanian (sq) has a lot of sentences related to law and government.

▶ Tsonga (ts), a Bantu language spoken in Southern Africa, has dethroned Icelandic (is) as the language with the highest average utterance duration. I don't know enough about Tsonga to speculate why - it's a somewhat agglutinative language, but many Tsonga words are short.

▶ Bengali / Bangla (bn) has a significant amount of data that is not yet validated, and therefore does not appear in training / dev / test splits. There is a similar case for many languages new to Common Voice - it takes time to validate.

▶ The language with the highest average number of contributions per speaker is Taita (dav), a Bantu language from Kenya.

What do you make of the data visualisation? Are there any other insights you can see?

Big thanks to the CV team for all their efforts - EM, Jessica Rose, Dmitrij Feller and Justin Grant.

#linguistics

observablehq.com/@kathyreid/mo…


Can #typos make lies seem less deceptive or make true statements seem less true?

“true statements with grammatical errors and unusual word choices were seen as more deceitful, and lie statements with the same language were seen as less deceptive” and “I discovered a new brain response that is sensitive to the difference between perceived truths and lies.”

Dissertation permalink: digitalcommons.usu.edu/etd2023…

DOI: doi.org/10.26076/068f-415c

#ethics #xPhi #decisionScience #neuroscience #psychology #business #communication #linguistics


A Language Log post about nonbinary honorifics and the etymology of Miss and Missus (from Latin Magistra "mistress", not Latin Magus and its Old Iranian relatives) led me to this essay on how the connotations of each term have changed since the 18th century: cam.ac.uk/research/news/mistre… #philology #linguistics


edit: still trying! please keep on boosting!

we're moving to #Prague! and we need some friends.
we're moving to #Czechia in September and we might be a bit lonely. so if you want to befriend a couple of quirky migrants, let me know!

I'm into #music, #FOSS, #linguistics, #DoctorWho, #TTRPG and #DropoutTV. oh, and I'm #trans.
my girlfriend likes #medicine, British television, #Disney, #Eurovision and #TrueCrime. "I am the bisexual stereotype".

:boostRequest:
@prague

#Praha #Prag #PleaseBoost


Friendly reminder: the Finnish Journal of Linguistics is accepting submissions for its 2024 volume 🇫🇮 The deadline is 31 January.
Languages: English, Finnish, French, German, Swedish.
Fully open access, no article processing fees. All areas of (general) #linguistics welcome.
journal.fi/finjol/about


I am super excited about this mini-conference on #reproducibility in #linguistics that I am organising this evening: Four of my M.A. students will be reporting on their attempts to reproduce the results of four published quantitative linguistics papers for which the data is available, but not the code!

Colleagues, they have *a lot* of things to report! So, if you're in the area (Cologne), do come along! There will be #ReproducibiliTea and Christmas biscuits! 🍵 🍪 #OpenScience


This has been going around on Twitter, but I neglected my community here :) I'm sorry about this :)
Tomorrow at noon EST, I will give a #talk on the #accessibility of #language #learning and #linguistics in general for #screenReader users as part of the a11yTalks event. This will be a public event with no need to register so if this is something any of you are interested in, here's the link :) a11ytalks.com/posts/2023-MAY/ #speaker #a11y


JTG needs someone to do corpus analysis of these dialects (Mexico, Venezuela, Colombia, and the Caribbean) to create glossaries and other materials. I have worked with them and they pay well. jobs.jtg-inc.com/x/detail/a29e… #Linguistics #Jobs @linguistics #Corpus


I have opened recruitment for an online experiment investigating cross-linguistic perceptions of iconicity! If you are, or know, a deaf signer of Norwegian Sign Language, please pass on this announcement from the Norwegian Deaf Association webpage! #linguistics #signlanguage doveforbundet.no/nyheter/2023/…


Fascinating article!

"Unearthing a Long Ignored African Writing System, One Researcher Finds African History, by Africans: BU anthropologist Fallou Ngom discovered Ajami, a modified Arabic script, in a box of his late father’s old papers" posted December 21, 2022, written by Molly Callahan

bu.edu/articles/2022/fallou-ng…

#linguistics #Ajami #Africa


I’m writing an article that expands my microblog entry on stylometric fingerprinting to give more comprehensive advice. I am partially walking back my recommendation not to use machine translation and adding information about reading levels, among other things. Would anybody familiar with #stylometry, or with a #linguistics background and experience in close reading, be up for reviewing a rough draft next week?

I’d also be interested in how people may describe my own stylometric fingerprint (signature phrases, grammar quirks, etc), to use as an example.

Boosts appreciated.


I am:
🔵a #book lover
🔵mildly #burnedout but trying to take it easy 😊
🔵a #cat lover (sharing my place with 1 fluffy terrorist)
🔵#choleric (actively working on my tantrums though) 😂
🔵a good #communicator
🔵#compassionate
🔵#curious
🔵#Czech
🔵an #effectivealtruist (I translated Toby Ord's The Precipice into Czech)
🔵#empathic
🔵a #feminist
🔵#friendly
🔵a #linguistics nerd
🔵a #liberal
🔵a language #teacher
🔵a literary #translator
🔵very much in love with @stepan
🔵a #woman