<?xml version="1.0" encoding="UTF-8"?>
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:pumaoai.isti.cnr.it:cnr.isti/cnr.isti/2015-A3-001</identifier>
    <datestamp>2015-03-14</datestamp>
    <setSpec>openaire</setSpec>
  </header>
  <metadata>
    <oai_dc:dc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
      <dc:title>Structured prediction for quantification</dc:title>
      <dc:creator>Esuli, Andrea</dc:creator>
      <dc:creator>Sebastiani, Fabrizio</dc:creator>
      <dc:subject>Quantification</dc:subject>
      <dc:subject>Structured output prediction</dc:subject>
      <dc:subject>info:eu-repo/classification/acm/I.2.6 ARTIFICIAL INTELLIGENCE. Learning</dc:subject>
      <dc:description>We address the problem of quantification, a supervised learning task whose goal is, given a class, to estimate the relative frequency (or prevalence) of the class in a dataset of unlabelled items. Quantification has several applications in data and text mining, such as estimating the prevalence of positive reviews in a set of reviews of a given product, or estimating the prevalence of a given support issue in a dataset of transcripts of phone calls to tech support. So far, quantification has been addressed by learning a general-purpose classifier, counting the unlabelled items which have been assigned the class, and tuning the obtained counts according to some heuristics. In this paper we depart from the tradition of using general-purpose classifiers, and use instead a supervised learning model for structured prediction, capable of generating classifiers directly optimized for the (multivariate and non-linear) function used for evaluating quantification accuracy. The experiments that we have run on 5500 binary high-dimensional datasets (averaging more than 14,000 documents each) show that this method is more accurate, more stable, and more efficient than existing, state-of-the-art quantification methods.</dc:description>
      <dc:date>2015</dc:date>
      <dc:type>info:eu-repo/semantics/conferenceObject</dc:type>
      <dc:identifier>http://puma.isti.cnr.it/dfdownloadnew.php?ident=cnr.isti/cnr.isti/2015-A3-001</dc:identifier>
      <dc:language>en</dc:language>
      <dc:source>In: MLDAS 2015 - 2nd Machine Learning and Data Analytics Symposium (Doha, Qatar, 8-9 March 2015).</dc:source>
      <dc:format>application/pdf</dc:format>
      <dc:identifier>http://puma.isti.cnr.it/rmydownload.php?filename=cnr.isti/cnr.isti/2015-A3-001/2015-A3-001.pdf</dc:identifier>
      <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
    </oai_dc:dc>
  </metadata>
</record>