Difference between revisions of "Speech Technologies"

From HCE Wiki - The Human Cognitive Enhancement Wiki

Revision as of 11:06, 28 February 2017

Speech technologies are technologies or devices that can understand and/or produce human-like speech. Speech generation is useful in applications such as text-to-speech, electrolarynges, speech prostheses, and intelligent personal assistants. The first three are used as medical devices for people who have lost their voice. Speech synthesizers are also incorporated into devices that help visually impaired people. Intelligent personal assistants allow users to operate their devices hands-free by simply speaking the required commands, mostly in plain, natural speech.

Speech technologies deal with the voice, which is the dominant tool of interpersonal communication.[1] The importance of the voice is also acknowledged by the choice of 16 April as World Voice Day.[2]

http://archive.is/YU9D

http://www.speechatsri.com/products/eduspeak.shtml

http://www.speechtechmag.com/Articles/Editorial/Cover-Story/The-2015-State-of-the-Speech-Technology-Industry-Speech-Engine-101922.aspx

Main characteristics

Speech technologies can be divided into those used in medicine and those intended for commercial use. The former group is represented primarily by electrolarynges and speech prostheses, while intelligent personal assistants belong to the latter. Speech synthesis serves both purposes: it is built into intelligent personal assistants and GPS navigation systems, but also into systems for the visually impaired and speech synthesizers for people who have lost their voice.[3] These technologies take the form of devices, software, or a combination of both.

Historical overview

The first speaking machines were developed in antiquity and the Middle Ages. They were not genuine speaking machines, however, since they depended on people speaking inside them. The first genuine speaking machine was introduced by the Hungarian civil servant and inventor Wolfgang von Kempelen. He described his speech synthesiser in the book "Mechanismus der menschlichen Sprache nebst der Beschreibung seiner sprechenden Maschine" [The Mechanism of Human Speech, with a Description of a Speaking Machine], published in 1791.[4]

In the 19th century, researchers also focused on helping people who had lost their voice or had serious problems with their throat. Jan Nepomuk Czermak described the first laryngeal prosthesis in 1859. His attempt was followed by the introduction of various speech prostheses and artificial larynges.[5] Later, the Austrian surgeon Theodor Billroth performed the first successful total extirpation of the larynx.[6]

The 20th century brought important breakthroughs in various fields of speech technology. Speech synthesis began to be mechanized with the introduction of the Voder in the 1930s.[7] New synthesis techniques also made synthetic voices sound more natural and, more recently, have made it possible to preserve the voices of patients who are losing their voice.[8] The first electrolarynges were introduced in 1942 by Wright.[5] Surprisingly, the first tracheoesophageal voice restoration was not devised by a professional but carried out by a patient, using a red-hot ice pick, in 1931. Surgeons were, however, unable to replicate the procedure.[9] It was therefore abandoned until Erwin Mozolewski presented his tracheoesophageal voice prosthesis.[10]

In the middle of the 20th century, the first speech recognition systems were introduced. The first was "Audrey", which was able to recognise digits spoken by a single voice. It was followed by IBM's "Shoebox", presented at the 1962 World's Fair, which was able to recognise 16 English words. Another important speech recognition system was "Harpy", developed at Carnegie Mellon University with funding from the U.S. Department of Defense between 1971 and 1976. It could recognise 1,011 words, roughly the vocabulary of a three-year-old child.[11]

Important Dates

  • 1769 - Wolfgang von Kempelen developed the first genuine speech synthesizer[12]
  • 1859 - the first pneumatic laryngeal prosthesis was introduced by Jan Nepomuk Czermak[5]
  • 1873 - Billroth conducted the first successful total laryngectomy[5]
  • 1931 - the first laryngeal puncture was conducted by a patient[13]
  • 1937 - the speech synthesizer Voder was unveiled[7]
  • 1942 - Wright developed the first electrolarynx 'Sonovox'[14]
  • 1952 - Bell Laboratories presented "Audrey"[11]
  • 1972 - Erwin Mozolewski introduced a tracheoesophageal voice prosthesis[15]
  • 1976 - "Harpy" was developed[11]
  • 1987 - Apple Knowledge Navigator was presented[16]
  • 4th February 2010 - Siri Inc. unveiled Siri[17]
  • 6th November 2014 - Amazon.com, Inc introduced Amazon Echo[18]

Enhancement/Therapy/Treatment

http://www.pcadvisor.co.uk/feature/software/how-digital-assistants-are-replacing-our-brains-3530140/

Ethical & Health Issues

http://www.techadvisor.co.uk/opinion/internet/no-one-cares-about-privacy/

Public & Media Impact and Presentation

Public Policy

Related Technologies, Projects or Scientific Research

http://www.speechtechmag.com/Articles/News/Speech-Technology-News-Features/IBM-Makes-Watson-TTS-More-Expressive--109477.aspx

http://link.springer.com/journal/10772

References

  1. LALWANI, Mona. Personal assistants are ushering in the age of AI at home. Engadget [online]. 2016, Oct 5. Available online at: https://www.engadget.com/2016/10/05/personal-assistants-google-home-ai/ (Retrieved 5th January, 2017).
  2. SIEGEL-ITZKOVICH, Judy. Voice of the people. The Jerusalem Post [online]. 2015, Apr 26. Available online at: http://www.jpost.com/Israel-News/Health/Voice-of-the-people-399185 (Retrieved 17th January, 2017).
  3. TAYLOR, Paul. Text-to-Speech Synthesis. University of Cambridge Department of Engineering [online]. 2014. Available online at: http://mi.eng.cam.ac.uk/~pat40/ttsbook_draft_2.pdf (Retrieved 2nd February, 2017).
  4. DUDLEY, Homer, TARNOCZY, T. H. The speaking machine of Wolfgang von Kempelen. Journal of the Acoustical Society of America 22, 151-166. Doi:10.1121/1.1906583. Available online at: http://pubman.mpdl.mpg.de/pubman/item/escidoc:2316415:3/component/escidoc:2316414/Dudley_1950_Speaking_machine.pdf (Retrieved 2nd February, 2017).
  5. KEITH, Robert L., SHANKS, James. Laryngectomee Rehabilitation: Past and Present. In: Speech and Language: Advances in Basic research and Practice. New York: Academic Press, 1983. Available online at: https://books.google.cz/books?id=0C60BQAAQBAJ&pg=PA126&lpg=PA126&dq=Cooper-Rand+electrolarynx&source=bl&ots=or27eudDf2&sig=22heagC08Fpk57qGILvufHxwCyM&hl=cs&sa=X&ved=0ahUKEwi1--bt3r7RAhWGuxQKHbrFCC04ChDoAQgdMAE#v=onepage&q=Cooper-Rand%20electrolarynx&f=false (Retrieved 13th January, 2017).
  6. KAZI, R. A., et al. Christian Albert Theodor Billroth: Master of surgery. Journal of postgraduate medicine, 2004, 50.1: 82. Available online at: https://tspace.library.utoronto.ca/bitstream/1807/2074/1/jp04025.pdf (Retrieved 25th February, 2016).
  7. DUNCAN. Klatt’s Last Tapes: A History of Speech Synthesisers. Communication Aids [online]. 2013, Aug 10. Available online at: http://communicationaids.info/history-speech-synthesisers (Retrieved 2nd February, 2017).
  8. JŮZOVÁ, M., TIHELKA, D., MATOUŠEK, J. Designing High-Coverage Multi-level Text Corpus for Non-professional-voice Conservation. In: Speech and Computer. Volume 9811 of the series Lecture Notes in Computer Science. Cham: Springer, 2016, pp 207-215. Doi: 10.1007/978-3-319-43958-7_24 Available online at: http://link.springer.com/chapter/10.1007/978-3-319-43958-7_24 (Retrieved 16th February, 2017).
  9. BLOM, Eric D. Current Status of Voice Restoration Following Total Laryngectomy. Oncology [online]. 2000, Jun 1. Available online at: http://www.cancernetwork.com/head-neck-cancer/current-status-voice-restoration-following-total-laryngectomy (Retrieved 19th January, 2017).
  10. MOZOLEWSKI, Erwin S., et al. "Arytenoid vocal shunt in laryngectomized patients." The Laryngoscope 85.5 (1975): 853-861.
  11. PINOLA, Melanie. Speech Recognition Through the Decades: How We Ended Up With Siri. PCWorld [online]. 2011, Nov 2. Available online at: http://www.pcworld.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html (Retrieved 28th February, 2017).
  12. WOODFORD, Chris. Speech synthesizers. EXPLAINTHATSTUFF [online]. 2017, Jan 21. Available online at: http://www.explainthatstuff.com/how-speech-synthesis-works.html (Retrieved 16th February, 2017).
  13. BLOM, Eric D. Current Status of Voice Restoration Following Total Laryngectomy. Oncology [online]. 2000, Jun 1. Available online at: http://www.cancernetwork.com/head-neck-cancer/current-status-voice-restoration-following-total-laryngectomy (Retrieved 19th January, 2017).
  14. LIU, Hanjun, NG, Manwa L. Electrolarynx in voice rehabilitation. Auris Nasus Larynx, 2007, 34.3: 327-332.
  15. TARNOWSKA, Czesława. Wspomnienie o profesorze Erwinie Mozolewskim. Pomorski Uniwersytet Medyczny w Szczecinie [online]. Available online at: https://www.pum.edu.pl/__data/assets/file/0009/14868/Wspomnienie_o_profesorze_Erwin_7517.pdf (Retrieved 19th January, 2017).
  16. DigiBarn Computer Museum. The Knowledge Navigator concept piece by Apple Computer (1987). DigiBarn Computer Museum [online]. Available online at: http://www.digibarn.com/collections/movies/knowledge-navigator.html (Retrieved 5th January, 2017).
  17. HARRISON, Natalie and BREWER, Teresa. Apple Launches iPhone 4S, iOS 5 & iCloud. Apple [online]. 2011, Oct 4. Available online at: http://www.apple.com/pr/library/2011/10/04Apple-Launches-iPhone-4S-iOS-5-iCloud.html (Retrieved 16th December, 2016).
  18. WELCH, Chris. Amazon just surprised everyone with a crazy speaker that talks to you. The Verge [online]. 2014, Nov 6. Available online at: http://www.theverge.com/2014/11/6/7167793/amazon-echo-speaker-announced (Retrieved 20th December, 2016).