
@www2007 in Banff -- tagging workshop

I'm in Banff this week, at the www2007 meeting. I'm on the program committee for the tagging workshop that is taking place today, and will stay for a bit to see what the technical side of the web is thinking. These are my notes from the things that I've attended (or wish I could attend)...

 


Workshop: Tagging and Metadata for Social Information Organization

http://www2007.org/workshop-W9.php

 

Network Properties of Folksonomies
Christoph Schmitz, Miranda Grahl, Andreas Hotho, Gerd Stumme, Ciro Cattuto, Andrea Baldassarri, Vittorio Loreto, and Vito D. P. Servedio

http://www2007.org/workshops/paper_13.pdf

Christoph Schmitz presents the co-authored paper.

  • Small worlds (large clustering coefficient but small path length)
  • used data from BibSonomy and del.icio.us
  • characteristic path lengths similar in the two instances
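To make the two small-world measures concrete, here is a minimal sketch that computes an average clustering coefficient and a characteristic path length. It assumes a plain undirected graph held as an adjacency dict; the paper works with folksonomy data from BibSonomy and del.icio.us, so the toy graph and the function names below are illustrative only.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the per-node clustering coefficients."""
    return sum(clustering_coefficient(adj, n) for n in adj) / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nxt in adj[current]:
                if nxt not in dist:
                    dist[nxt] = dist[current] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Toy graph (made up): two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(average_clustering(adj))
print(characteristic_path_length(adj))
```

A small-world graph is one where the first number is high relative to a random graph of the same size while the second stays small.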

  • tag co-occurrence
    • looked at co-occurrence of tags applied to the same resource by the same user
    • anomalies in graph lines related to spamming activities (large numbers of tags, and the same number of tags for multiple posts, e.g. 50 or 100)
    • threshold set by number of tags to remove these
  • nearest neighbour strength
    • similar for BibSonomy and del.icio.us
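A rough sketch of the co-occurrence counting with a tag-count threshold might look like the following. The post format `(user, resource, tags)`, the cut-off value, and the sample data are all assumptions for illustration; the talk did not state the threshold actually used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical cut-off: posts with this many tags or more are treated as spam.
MAX_TAGS_PER_POST = 50


def cooccurrence(posts, max_tags=MAX_TAGS_PER_POST):
    """Count tag pairs applied to the same resource by the same user.

    Each post is one user's tags on one resource, so pairs within a post
    are exactly the same-user, same-resource co-occurrences.
    """
    counts = Counter()
    for user, resource, tags in posts:
        unique = sorted(set(tags))
        if len(unique) >= max_tags:
            continue  # skip suspiciously large posts
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


# Made-up sample data.
posts = [
    ("u1", "r1", ["web", "tagging", "folksonomy"]),
    ("u2", "r1", ["tagging", "web"]),
    ("spam", "r2", ["tag%d" % i for i in range(100)]),
]
counts = cooccurrence(posts)
print(counts.most_common(3))
```

Pairs are stored in sorted order so that ("tagging", "web") and ("web", "tagging") are counted as the same edge.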

jt - what does this mathematical description of tags tell us about their nature, or the nature of the systems in which users apply them?

hope that this is useful for spam detection (mathematical identification of anomalies)

sees work as primarily descriptive -- seeing if complex system theory can offer insights

 

Tag-Cloud Drawing: Algorithms for Cloud Visualization
Owen Kaser and Daniel Lemire

http://www2007.org/workshops/paper_12.pdf

Owen Kaser presents -- focus on how to compute a good tag cloud layout

Assumptions

  • fixed width rectangles whose height can vary
  • bounding boxes for tags inside a larger box, for the cloud

Goals

  • asserts that semantically related things should have a high proximity
  • goal to render with HTML in-line elements [not tables]
  • white space is undesirable [unless separating tags]

Problems

  • order of tags (alphabetic)
  • Knuth-Plass algorithm for 'full justified paragraphs'
  • when varying height was introduced, the 'badness measures' increased
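As an illustration of the line-breaking idea, here is a simplified dynamic-programming sketch in the spirit of Knuth-Plass. It is not the paper's algorithm: it assumes tags stay in a fixed (e.g. alphabetic) order, considers widths only, and uses squared leftover space on every line as a stand-in badness measure (Knuth-Plass proper treats the last line differently). The function name and sample widths are made up.

```python
def break_lines(widths, line_width, gap=1):
    """Split tags, kept in order, into lines minimising total squared slack.

    Returns a list of lines, each a list of tag indices.
    """
    if any(w > line_width for w in widths):
        raise ValueError("a tag is wider than the line")
    n = len(widths)
    inf = float("inf")
    best = [0.0] + [inf] * n  # best[i]: minimum cost of placing the first i tags
    prev = [0] * (n + 1)      # prev[i]: where the last line of that layout starts
    for end in range(1, n + 1):
        used = -gap
        # Try every possible start for the line that ends at tag `end`.
        for start in range(end, 0, -1):
            used += widths[start - 1] + gap
            if used > line_width:
                break
            cost = best[start - 1] + (line_width - used) ** 2
            if cost < best[end]:
                best[end] = cost
                prev[end] = start - 1
    # Walk the back-pointers to recover the chosen lines.
    lines = []
    end = n
    while end > 0:
        start = prev[end]
        lines.append(list(range(start, end)))
        end = start
    lines.reverse()
    return lines


layout = break_lines([3, 2, 2, 5], line_width=6)
print(layout)
```

With these widths a greedy first-fit would put the first two tags on one line and leave the third alone; the dynamic program instead groups the second and third tags, which leaves less total ragged space.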

  • tag arrangement / rearrangement allowed
  • look at other problems for placing rectangles into rectangles

  • semantic association
  • 'min-cut placement' -- keeps strongly associated things together
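The min-cut idea can be sketched as splitting the tags into two halves so that as little association weight as possible crosses the split, then recursing on each half. The version below is a brute-force single bisection for illustration only: it is exponential in the number of tags, whereas heuristics are normally used at realistic sizes. The weights and tag names are made up.

```python
from itertools import combinations


def min_cut_bisection(tags, weight):
    """Balanced two-way split minimising the total weight across the cut.

    `weight` maps frozenset({tag_a, tag_b}) to an association strength;
    missing pairs count as 0.
    """
    tags = sorted(tags)
    half = len(tags) // 2
    best_cut = None
    best_left = None
    for left in combinations(tags, half):
        left_set = set(left)
        right = [t for t in tags if t not in left_set]
        cut = sum(weight.get(frozenset((a, b)), 0)
                  for a in left for b in right)
        if best_cut is None or cut < best_cut:
            best_cut, best_left = cut, left
    left_set = set(best_left)
    right = [t for t in tags if t not in left_set]
    return list(best_left), right, best_cut


# Made-up association weights (e.g. co-occurrence counts).
weight = {
    frozenset(("python", "code")): 5,
    frozenset(("cats", "dogs")): 4,
    frozenset(("code", "cats")): 1,
}
left, right, cut = min_cut_bisection(["python", "code", "cats", "dogs"], weight)
print(left, right, cut)
```

Strongly associated tags end up in the same half, so a layout that places each half in its own region of the cloud keeps them near each other.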

discussion: Usability of tag clouds

  • Scott mentioned CHI paper on tag usability from IBM
  • poster paper tomorrow on tag positioning
  • well-understood [alpha] may be more intuitive than meaning-laden [semantic]
  • Andy Edmonds (Atlanta) mouse-over interface showing co-occurrence of tags
  • seb's work @ the Powerhouse (relation of tag clouds to structured vocabularies)

jt - it all comes down to the purpose of the tags -- what is it we are trying to do with tag clouds anyway?

 

Semkey: A Semantic Collaborative Tagging System
Andrea Marchetti, Maurizio Tesconi, Francesco Ronzano, Marco Rosella and Salvatore Minutoli

http://www2007.org/workshops/paper_45.pdf

Andrea Marchetti presents the paper

main weakness in tagging = lack of relationships between concepts and tags

  • synonymy, lexical forms, typos or alternative spellings, levels of precision, multi-linguality
  • use concepts instead of tags, expressed as RDF
  • usability important (add one click to make semantic assertion?)
  • properties: hasAsTopic, kindOfResource, myOpinionIs
  • shared semantic resources: general concepts, universally accepted, concepts by tag -- disambiguation
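To picture what "concepts instead of tags, expressed as RDF" could look like, here is a small sketch that builds triples using the three property names from the talk and prints them in N-Triples form. The namespace URI, the use of Wikipedia URLs as concept identifiers, and the helper names are assumptions for illustration, not SemKey's actual vocabulary or code.

```python
# Hypothetical namespace; the talk did not give the real one.
SEMKEY_NS = "http://example.org/semkey#"

# Property names as listed in the talk.
PROPERTIES = ("hasAsTopic", "kindOfResource", "myOpinionIs")


def assertion(resource, prop, concept):
    """Build one (subject, predicate, object) triple of URIs."""
    if prop not in PROPERTIES:
        raise ValueError("unknown property: %s" % prop)
    return (resource, SEMKEY_NS + prop, concept)


def to_ntriples(triples):
    """Serialise URI-only triples, one N-Triples statement per line."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in triples)


# Assumed example: a page tagged with a concept rather than a bare string.
triples = [
    assertion("http://www2007.org/",
              "hasAsTopic",
              "http://en.wikipedia.org/wiki/World_Wide_Web"),
    assertion("http://www2007.org/",
              "kindOfResource",
              "http://en.wikipedia.org/wiki/Academic_conference"),
]
print(to_ntriples(triples))
```

Because the object is a concept URI rather than a free-text tag, spelling variants and synonyms collapse onto one identifier.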

experiment uses

  • Wikipedia data (useful disambiguation pages for polysemy, redirect pages for synonymy, limited multi-linguality)
  • WordNet
  • OmegaWiki
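The way disambiguation and redirect pages might be used can be sketched as a simple lookup: redirects map synonyms onto one concept, and disambiguation entries list the candidate concepts for a polysemous tag. The dictionaries below are tiny made-up stand-ins, not real Wikipedia data, and `resolve` is a hypothetical helper.

```python
# Made-up stand-ins for redirect and disambiguation data.
REDIRECTS = {
    "nyc": "New_York_City",
    "new york": "New_York_City",
}
DISAMBIGUATION = {
    "java": ["Java_(programming_language)", "Java_(island)"],
}


def resolve(tag):
    """Map a free-text tag to candidate concepts.

    Returns (status, concepts) where status is 'ambiguous', 'concept',
    or 'unknown'.
    """
    key = tag.strip().lower()
    if key in DISAMBIGUATION:
        # Polysemy: the user would have to pick one of these.
        return ("ambiguous", list(DISAMBIGUATION[key]))
    if key in REDIRECTS:
        # Synonymy: several spellings collapse onto one concept.
        return ("concept", [REDIRECTS[key]])
    return ("unknown", [])


print(resolve("NYC"))
print(resolve("Java"))
```

Only the 'ambiguous' case needs to interrupt the user, which is one way of keeping the extra tagging effort down.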

Wikipedia covers the 140 most popular tags in del.icio.us

SemKey Demo -- see http://www.semkey.org/

  • Firefox extension, contributes terms to the shared resource
  • also supports tag browsing

questions

scott bateman: usability of this kind of system with additional steps?

frank: why would people be motivated to perform these additional tasks?

comment: within closed community (population studies?)

frank: can we shift more intelligence to the back-end so that the user does less?

desire to define concepts at input rather than later?

thomas?: highly polysemic terms could be prompted for disambiguation -- push the knowledge out to the user?

smaller communities -- with motivation and confined semantics as an appropriate test group?

jt - this seems a really heavy load to add to the tagging activity (worse than steve.museum's contemplated facets...)
