{"id":438153,"date":"2024-10-20T08:02:34","date_gmt":"2024-10-20T08:02:34","guid":{"rendered":"https:\/\/pdfstandards.shop\/product\/uncategorized\/ieee-3302-2022\/"},"modified":"2024-10-26T15:06:33","modified_gmt":"2024-10-26T15:06:33","slug":"ieee-3302-2022","status":"publish","type":"product","link":"https:\/\/pdfstandards.shop\/product\/publishers\/ieee\/ieee-3302-2022\/","title":{"rendered":"IEEE 3302-2022"},"content":{"rendered":"
New IEEE Standard – Active. This standard adopts the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) Technical Specification Context-based Audio Enhancement (CAE) Version 1.4 as an IEEE standard. MPAI-CAE Version 1.4 is a collection of four use cases specifying AI-based technologies for audio-related applications, including entertainment, communication, post-production, teleconferencing, and restoration.
| PDF Pages | PDF Title |
|---|---|
| 1 | IEEE Std 3302™-2022 Front Cover |
| 2 | Title page |
| 4 | Important Notices and Disclaimers Concerning IEEE Standards Documents |
| 8 | Participants |
| 9 | Introduction |
| 10 | Specification for MPAI-CAE |
| 11 | Contents |
| 13 | 1 Introduction (Informative) |
| 14 | 2 Scope of standard |
| 15 | 2.1 Emotion-Enhanced Speech (EES); 2.2 Audio Recording Preservation (ARP); 2.3 Speech Restoration System (SRS); 2.4 Enhanced Audioconference Experience (EAE) |
| 16 | 2.5 Normative content of the Use Cases; 3 Terms and Definitions |
| 18 | 4 References; 4.1 Normative References |
| 19 | 4.2 Informative References; 5 Use Case Architectures; 5.1 Emotion-Enhanced Speech (EES); 5.1.1 Scope of Use Case; 5.1.2 I/O data |
| 20 | 5.1.3 Implementation Architecture; 5.1.4 AI Modules |
| 21 | 5.2 Audio Recording Preservation (ARP); 5.2.1 Scope of Use Case; 5.2.2 I/O data; 5.2.3 Implementation Architecture |
| 23 | 5.2.4 AI Modules; 5.3 Speech Restoration System (SRS); 5.3.1 Scope of Use Case |
| 24 | 5.3.2 I/O Data; 5.3.3 Implementation Architecture |
| 25 | 5.3.4 AI Modules; 5.4 Enhanced Audioconference Experience (EAE); 5.4.1 Scope of Use Case |
| 26 | 5.4.2 I/O data; 5.4.3 Implementation Architecture |
| 27 | 5.4.4 AI Modules |
| 28 | 6 AIMs; 6.1 AIM Interoperability; 6.2 AIMs and their data; 6.2.1 Emotion Enhanced Speech; 6.2.2 Audio Recording Preservation (ARP) |
| 29 | 6.2.3 Speech Restoration System (SRS); 6.2.4 Enhanced Audioconference Experience (EAE); 6.3 Data Formats |
| 30 | 6.3.1 Access Copy Files; 6.3.2 Audio Scene Geometry; 6.3.2.1 Syntax |
| 31 | 6.3.2.2 Semantics |
| 32 | 6.3.3 Damaged List; 6.3.3.1 Syntax; 6.3.3.2 Semantics; 6.3.4 Denoised Speech |
| 33 | 6.3.5 Editing List; 6.3.5.1 Syntax |
| 34 | 6.3.5.2 Semantics |
| 35 | 6.3.6 Emotion; 6.3.6.1 Syntax; 6.3.6.2 Semantics |
| 39 | 6.3.7 Emotionless Speech; 6.3.8 Interleaved Multichannel Audio; 6.3.9 Irregularity File; 6.3.9.1 Syntax |
| 40 | 6.3.9.2 Semantics |
| 42 | 6.3.10 Irregularity Image; 6.3.11 Microphone Array Audio; 6.3.12 Microphone Array Geometry; 6.3.12.1 Syntax |
| 43 | 6.3.12.2 Semantics |
| 44 | 6.3.13 Mode Selection |
| 45 | 6.3.14 Multichannel Audio Stream; 6.3.15 Neural Network Speech Model |
| 46 | 6.3.16 Preservation Audio File; 6.3.17 Preservation Audio-Visual File; 6.3.18 Preservation Master Files; 6.3.19 Source Dictionary; 6.3.20 Source Model KB Query Format; 6.3.21 Speech Features |
| 47 | 6.3.21.1 Semantics |
| 48 | 6.3.22 Spherical Harmonic Decomposition; 6.3.23 Transform Denoised Speech; 6.3.24 Transform Speech |
| 49 | 6.3.25 Transform Multichannel Audio; 6.3.26 Video |
| 50 | Annex 1 MPAI-wide terms and definitions |
| 53 | Annex 2 Notices and Disclaimers Concerning MPAI Standards (Informative) |
| 55 | Annex 3 Patent Declarations |
| 56 | Annex 4 Examples (Informative); A4.1 Audio Scene Geometry; A4.2 Damaged List; A4.3 Editing List |
| 57 | A4.4 Irregularity File |
| 58 | A4.5 Microphone Array Geometry |
| 59 | A4.6 Speech Features 1; A4.7 Speech Features 2 |
| 60 | Annex 5 AIW and AIM Metadata of CAE-EES; A5.1 AIW Metadata |
| 62 | A5.2 AIM Metadata; A5.2.1 Speech Feature Analyser1 |
| 63 | A5.2.2 Speech Feature Analyser2 |
| 64 | A5.2.3 Emotion Feature Producer |
| 65 | A5.2.4 Emotion Inserter1; A5.2.5 Emotion Inserter2 |
| 67 | Annex 6 AIW and AIM Metadata of CAE-ARP; A6.1 AIW metadata |
| 71 | A6.2 AIM metadata; A6.2.1 Audio Analyser |
| 72 | A6.2.2 Video Analyser |
| 73 | A6.2.3 Tape Irregularity Classifier |
| 74 | A6.2.4 Tape Audio Restoration |
| 75 | A6.2.5 Packager |
| 77 | Annex 7 AIW and AIM Metadata of CAE-SRS; A7.1 AIW metadata |
| 79 | A7.2 AIM metadata; A7.2.1 Speech Model Creation; A7.2.2 Speech Synthesiser |
| 80 | A7.2.3 Assembler |
| 82 | Annex 8 AIW and AIM Metadata of CAE-EAE; A8.1 AIW Metadata |
| 87 | A8.2 AIM Metadata; A8.2.1 Analysis Transform |
| 88 | A8.2.2 Sound Field Description |
| 89 | A8.2.3 Speech Detection and Separation |
| 90 | A8.2.4 Noise Cancellation |
| 91 | A8.2.5 Synthesis Transform |
| 92 | A8.2.6 Packager |
| 94 | Back Cover |