Scientific report: "A Comprehensive Gold Standard for the Enron Organizational Hierarchy"

Apoorv Agarwal (1), Adinoyi Omuya (1), Aaron Harnly (2), Owen Rambow (3)
(1) Department of Computer Science, Columbia University, New York, NY, USA
(2) Wireless Generation, Inc., Brooklyn, NY, USA
(3) Center for Computational Learning Systems, Columbia University, New York, NY, USA
apoorv@ awo210 8@ faaron@ frambow@

Abstract

Many researchers have attempted to predict the Enron corporate hierarchy from the data. This work, however, has been hampered by a lack of data. We present a new, large, and freely available gold-standard hierarchy. Using our new gold standard, we show that a simple lower bound for social network-based systems outperforms an upper bound on the approach taken by current NLP systems.

1 Introduction

Since the release of the Enron email corpus, many researchers have attempted to predict the Enron corporate hierarchy from the email data. This work, however, has been hampered by a lack of data about the organizational hierarchy. Most researchers have used the job titles assembled by Shetty and Adibi (2004) and then have attempted to predict the relative ranking of two people's job titles (Rowe et al., 2007; Palus et al., 2011). A major limitation of the list compiled by Shetty and Adibi (2004) is that it only covers those "core" employees for whom the complete email inboxes are available in the Enron dataset. However, it is also interesting to determine whether we can predict the hierarchy of other employees, for whom we only have an incomplete set of emails (those that they sent to or received from the core employees).

This is difficult in particular because there are dominance relations between two employees such that no email between them is available in the Enron data set. The difficulties with the existing data have meant that researchers have either not performed quantitative analyses (Rowe et al., 2007), or have performed them on very small sets, for example Bramsen .
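
To make the last point concrete, the following is a minimal sketch of how dominance pairs can exist without any email between the two people. It is not the authors' method or data format: the names, the `reports_to` mapping, the `emails` set, and both helper functions are hypothetical, and the hierarchy is assumed to be acyclic.

```python
# Hypothetical toy hierarchy: each key reports directly to its value.
# Assumption: the reporting structure contains no cycles.
reports_to = {
    "bob": "alice",
    "carol": "alice",
    "dave": "bob",
}

# Hypothetical (sender, recipient) pairs observed in an email corpus.
emails = {("alice", "bob"), ("bob", "dave"), ("carol", "alice")}


def dominance_pairs(reports_to):
    """Return all (superior, subordinate) pairs, including indirect ones."""
    pairs = set()
    for person in reports_to:
        boss = reports_to.get(person)
        # Walk up the chain of managers until the top is reached.
        while boss is not None:
            pairs.add((boss, person))
            boss = reports_to.get(boss)
    return pairs


def pairs_without_email(pairs, emails):
    """Keep dominance pairs with no email exchanged in either direction."""
    return {
        (a, b)
        for a, b in pairs
        if (a, b) not in emails and (b, a) not in emails
    }


all_pairs = dominance_pairs(reports_to)
unseen = pairs_without_email(all_pairs, emails)
print(sorted(all_pairs))
print(sorted(unseen))
```

In this toy case "alice" dominates "dave" through "bob", yet the two never exchange a message, so a system that only scores pairs with observed email could never be evaluated on that relation.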
