This dataset spans a number of big news events (the Olympics; both US presidential nominating conventions; the beginnings of the financial crisis; ...) as well as everything else you might expect to find posted to blogs." [ICWSM09|http://www.icwsm.org/2009/data/] | compressed XML (tar.gz) | [HTTPS|https://webstar.deri.ie/datasets/icwsm2009/] | NO |
| SINDICE-DUMP | This dataset is a Sindice dump from March 2009. Most of its contents can also be found in the BTC09 dataset. | [NQ format|http://sw.deri.org/2008/07/n-quads/] \- 100GB uncompressed, 7.5GB gz | [HTTPS|https://webstar.deri.ie/datasets/sindice/] | NO |
| boards.ie \\ | Ten years of discussions from the Irish forum site [boards.ie|http://boards.ie], covering 1998 to 2008. The data was transformed from the SIOC format and includes all forums, threads, users, and posts. A [general description of the data|http://wiki.sioc-project.org/index.php/Data/Boards.ie/Structure] is available, along with a [graphical representation|https://dev.deri.ie/confluence/download/attachments/16777326/boards.ie-schema.png] of the schema and some simple [example queries|https://dev.deri.ie/confluence/download/attachments/16777326/boards.ie-exmpl-queries.txt]. Raw data and SPARQL access are only available after agreeing to a [license|https://dev.deri.ie/confluence/download/attachments/16777326/boards.ie-license.txt]; please contact Marcel Karnstedt from the UIMR unit. \\ | RDF/XML \\
N-Triples \\ | \-\- \\ | YES \\ |