@ ASIST SIG-CR: second session

Exploring characteristics of social classification
Xia Lin, Joan E. Beaudoin, Yen Bui, and Kaushal Desai (Drexel University, USA)
http://www.slais.ubc.ca/users/sigcr/sigcr-06lin.pdf

- reference to his sig-cr paper in 1997 that talked about distributed indexing
- paper reports on work in doctoral seminar @ Drexel
- IMLS-funded project: interdisciplinary perspectives on information organization in the digital environment
- Is there a paradigm shift from traditional information organization to digital information organization?
- contrasts traditional/thesaurus (concept-based, static, domain-specific, constructed, independent of documents) with digital/dynamic (connection-based, dynamic, social-network based [jt: same words ≠ social network], large-scale, integrated document and concept space)

Experiments done in class
1. compare terms with controlled vocabulary

- Connotea citations with a PubMed ID enable correlation of tags/user-supplied metadata with the bibliographic record and its MeSH headings.
--> tagging / MeSH: MeSH = 1034, both = 54, tags = 540 [jt: what kinds of terms were those additional tags?]
--> tagging / automatic indexing had much higher correlation [matched tags with terms from titles]
- titles on tag input screen --> subliminal prompt
- tags more like natural language than thesaurus
- what are the purposes of tagging? self/others, concept/connection
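A minimal sketch of the kind of strict tag / MeSH comparison described above, assuming each citation comes with a list of user tags and a list of MeSH preferred terms. The function name, the normalization rule and the sample terms are illustrative assumptions, not taken from the paper.

```python
def compare_vocabularies(tags, mesh_headings):
    """Strict match between user tags and MeSH preferred terms.

    Terms are compared after lowercasing and collapsing whitespace;
    no entry terms or thesaural relations are consulted.
    """
    def normalize(term):
        return " ".join(term.lower().split())

    tag_set = {normalize(t) for t in tags}
    mesh_set = {normalize(m) for m in mesh_headings}
    return {
        "tags_only": sorted(tag_set - mesh_set),
        "both": sorted(tag_set & mesh_set),
        "mesh_only": sorted(mesh_set - tag_set),
    }


# Invented sample data for one citation (not from the study).
result = compare_vocabularies(
    ["genomics", "Protein  Folding", "to-read"],
    ["Genomics", "Protein Folding", "Humans"],
)
for group, terms in result.items():
    print(group, len(terms), terms)
```

Matching against entry terms as well as preferred terms would only need a larger mesh_set per citation; the set arithmetic stays the same.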

2. categorizing tags into categories

- developed categories that were applied to flickr tags (5 people did classification)
- high occurrence of place name
[jt - compound is a category -- why not broken-up and given multiple tags?]
- highest used are place-name, compound, thing, people
- modest use of event, photographic, time
- lots of unknown -- need personal knowledge of content

3. convergence of vocabulary

- if useful tags are based on their popularity, what is the point at which it kicks in?

- del.icio.us urls to look at convergence of tags
[stats in paper -- needs review]
- factors affecting convergence include # of tags, # of users that tagged URL, # of URLs tagged
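One way to look for the point at which popularity "kicks in" is to track, user by user, how much of the tag space for a URL is taken up by its most popular tags. The sketch below is an assumption about how such a measure could be computed, not the method or statistics reported in the paper; the function name, the choice of k and the sample postings are invented.

```python
from collections import Counter


def top_k_share(postings, k=3):
    """Share of all tag assignments held by the k most frequent tags,
    recomputed after each successive user's posting for one URL."""
    counts = Counter()
    shares = []
    for user_tags in postings:
        counts.update(set(user_tags))  # count each tag once per user
        total = sum(counts.values())
        top = sum(n for _, n in counts.most_common(k))
        shares.append(top / total if total else 0.0)
    return shares


# Invented postings for a single URL, in chronological order.
postings = [
    ["folksonomy", "tagging"],
    ["tagging", "web2.0"],
    ["tagging", "folksonomy", "toread"],
    ["tagging", "folksonomy"],
]
shares = top_k_share(postings, k=2)
print([round(s, 3) for s in shares])
```

A curve that flattens out as more users tag the URL would suggest the vocabulary has converged; the number of tags, users and URLs all change where that happens.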

q: are social tagging and automatic indexing the same because people are just using the words in the doc?
a: controlled vocab suggestions might alter this

q: as terms move in and out of vocab how do you track this?
a: could study terms over time, watch popularity curve

q: were the terms matched to main terms in MeSH, or to main and alternative terms?
a: strict match to preferred term [jt: using thesaural structure would make the match more effective]

grant's q: time in tagging. We assume that a document is created, and then the indexing takes place. But with tagging, things seem to be happening at multiple times. What are the relationships between time, documents and tagging?
a: [jt: need to study tag re-use]. Watch the tag space for a URL and see how it evolves. Assignment: all tag the same article and compare. Watch what others do [jt: copy, augment, revise]; could be a good study.

q: different kinds of tagging spaces in flickr and del.icio.us -- have these been compared?
a: no observations/connections made.
[jt: user behaviour in the 2 spaces is different, purposes divergent.]

Searching the long tail: Hidden structure in social tagging
Emma Tonkin (UKOLN, UK)
http://www.slais.ubc.ca/users/sigcr/sigcr-06tonkin.pdf

Start with Zipf distribution / power law / long tail
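For reference, a Zipf-like distribution means a tag's frequency falls off roughly in proportion to 1/rank, so a handful of tags carry most of the use and the rest form the long tail. A minimal sketch for inspecting this in a tag list follows; the function names and the sample tags are invented for illustration.

```python
from collections import Counter


def rank_frequency(tags):
    """Return (rank, tag, frequency) rows, most frequent tag first."""
    counts = Counter(tags)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tag, freq) for rank, (tag, freq) in enumerate(ranked, start=1)]


def long_tail_share(tags, head_size):
    """Fraction of all tag uses that fall outside the top head_size tags."""
    if not tags:
        return 0.0
    tail = sum(freq for rank, _, freq in rank_frequency(tags) if rank > head_size)
    return tail / len(tags)


# Invented tag stream: a few popular tags and some one-off tags.
tags = ["london"] * 6 + ["travel"] * 3 + ["2006"] * 2 + ["me", "dsc0042"]
table = rank_frequency(tags)
for row in table:
    print(row)
print(long_tail_share(tags, head_size=2))
```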

Study of real-world tag use

- many tags are compound, complex, geotag

rationale: semantic web = ... want to slot behaviours into pre-existing concepts ... so tags= controlled vocabulary, and folksonomy=ontology.. but if simplicity and speed is the goal then ... [conflict]

tags are not always keywords [describing content], sometimes they are interpretations, annotations, task-related, and therefore pre-coordination and word order matters. grammatical structure in tag string: man holding a dead rose vs dead man rose. [jt- eats, shoots, and leaves]

- compound terms essential for semantics, decomposition supports retrieval.
- problems of partial matching

- standard mechanisms for decomposition of compound terms
- prefix, suffix, lexicon
BUT: tag compounds don't correspond to this as they often represent multiple concepts and have idiosyncratic content + structure
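As an illustration of the lexicon mechanism only (not the approaches tested in the paper), here is a greedy longest-match splitter for run-together tags. The function name, the min_len parameter and the tiny lexicon are assumptions made for the example.

```python
def split_compound(tag, lexicon, min_len=2):
    """Greedy left-to-right, longest-match split of a compound tag.

    Characters that match no lexicon word are collected into an
    'unknown' segment rather than discarded, and word order is kept.
    """
    tag = tag.lower()
    parts, unknown, i = [], "", 0
    while i < len(tag):
        match = None
        # Try the longest candidate first, down to min_len characters.
        for j in range(len(tag), i + min_len - 1, -1):
            if tag[i:j] in lexicon:
                match = tag[i:j]
                break
        if match:
            if unknown:
                parts.append(unknown)
                unknown = ""
            parts.append(match)
            i += len(match)
        else:
            unknown += tag[i]
            i += 1
    if unknown:
        parts.append(unknown)
    return parts


# Hypothetical lexicon; a real one would be far larger.
lexicon = {"dead", "man", "rose", "new", "york", "city"}
print(split_compound("NewYorkCity", lexicon))
print(split_compound("deadmanrose", lexicon))
print(split_compound("nyc2006", lexicon))
```

The last call shows the caveat noted above: an idiosyncratic compound with no lexicon coverage comes back as a single unknown segment, and greedy matching can also pick the wrong split when words overlap.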

Tests of ways to do this --> see the paper.
hybrid approach most effective.

- many questions remaining about deployment

--> sometimes users are 'untidy' for a reason. [jt: need to understand meaning in variation]

q: calvin moore - terms are not descriptors
"it all depends on what you are trying to get the user to do" [jt: who's leading???]

q jt: what about the place between the individual and the broader group -- the smaller network / professional or personal that produces the 'socio-lects' that marc davis / danah boyd were talking about in their various papers (chi, www2006).

Expertise classification: Collaborative classification vs. automatic extraction
Toine Bogers, Willem Thoonen, and Antal van den Bosch (Tilburg University, The Netherlands)

http://www.slais.ubc.ca/users/sigcr/sigcr-06bogers.pdf

- what is an expert?

- evidence is in output (articles, papers, presentations)
- TREC expert search task (find a person with expertise, rather than a document with an answer)

- contrast automatic extraction with expertise identified by tagging

Susan's asked nownow.com about experts on folksonomies....

q: vocabularies to describe expertise?
a: drawn from the university's classification

q: time as a factor?
a: weighting of old papers is less ... recently cited vs not cited in last n years.

q: joseph busch: have you looked at the expertise space as a whole? visualizations of disciplines?
a: visualizations on his list; have reviewed distribution of expertise across faculty (ego in self-description)

q: jens-eric: expertise is defined contextually; in some contexts i'm an expert, in others i'm not
