author     Henrik Boström <hbos@webrtc.org>      2019-07-03 17:11:10 +0200
committer  Commit Bot <commit-bot@chromium.org>  2019-07-04 08:13:45 +0000
commit     d2c336f892f7c92eecfb15de5b55dde0c1b757fb (patch)
tree       d688c8351e1e2442332e4a799ca58bea1808f2d4
parent     e8fbc5d7026e1d49121b1410318c1324e6bfedcd (diff)
download   webrtc-d2c336f892f7c92eecfb15de5b55dde0c1b757fb.tar.gz
[getStats] Implement "media-source" audio levels, fixing Chrome bug.
Implements RTCAudioSourceStats members:
- audioLevel
- totalAudioEnergy
- totalSamplesDuration
In this CL description these are collectively referred to as the audio
levels.

The audio levels are removed from the sending "track" stats (in Chrome,
these are now reported as undefined instead of 0).

Background:
For sending tracks, audio levels were always reported as 0 in Chrome
(https://crbug.com/736403), while audio levels were correctly reported
for receiving tracks. This problem affected the standard getStats() but
not the legacy getStats(), blocking some people from migrating. This was
likely not a problem in native third_party/webrtc code because the
delivery of audio frames from device to send-stream uses a different
code path outside of chromium.

A recent PR (https://github.com/w3c/webrtc-stats/pull/451) moved the
send-side audio levels to the RTCAudioSourceStats, while keeping the
receive-side audio levels on the "track" stats. This allows an
implementation to report the audio levels even if samples are not sent
onto the network (such as if an ICE connection has not been established
yet), reflecting some of the current implementation.

Changes:
1. Audio levels are added to RTCAudioSourceStats. Send-side audio
   "track" stats are left undefined. Receive-side audio "track" stats
   are not changed in this CL and continue to work.
2. Audio level computation is moved from the AudioState and
   AudioTransportImpl to the AudioSendStream. This is because a) the
   AudioTransportImpl::RecordedDataIsAvailable() code path is not
   exercised in chromium, and b) audio levels should, per spec, not be
   calculated on a per-call basis, for which the AudioState is defined.
3. The audio level computation is now performed in
   AudioSendStream::SendAudioData(), a code path used by both native
   and chromium code.
4. Comments are added to document the behavior of existing code, such
   as AudioLevel and AudioSendStream::SendAudioData().

Note: In this CL, just like before this CL, the audio level is only
calculated after an AudioSendStream has been created. This means that
before an O/A negotiation, audio levels are unavailable. According to
the spec, if we have an audio source, we should have audio levels. An
immediate solution to this would have been to calculate the audio level
in pc/rtp_sender.cc. The problem is that the
LocalAudioSinkAdapter::OnData() code path, while exercised in chromium,
is not exercised in native code. The issue of calculating audio levels
on a per-source basis rather than on a per-send-stream basis is left to
https://crbug.com/webrtc/10771, an existing "media-source" bug.

This CL can be verified manually in Chrome at:
https://codepen.io/anon/pen/vqRGyq

Bug: chromium:736403, webrtc:10771
Change-Id: I8036cd9984f3b187c3177470a8c0d6670a201a5a
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/143789
Reviewed-by: Oskar Sundbom <ossu@webrtc.org>
Reviewed-by: Stefan Holmer <stefan@webrtc.org>
Commit-Queue: Henrik Boström <hbos@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#28480}
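For orientation, a minimal, hedged C++ sketch (not part of this CL) of how a
native caller could read the new "media-source" members out of an already
collected stats report; it assumes the report was produced by the stats
collector and uses the existing GetStatsOfType<T>() and
RTCStatsMember::is_defined() APIs:

#include "api/stats/rtc_stats_report.h"
#include "api/stats/rtcstats_objects.h"
#include "rtc_base/logging.h"

void LogAudioSourceLevels(const webrtc::RTCStatsReport& report) {
  for (const webrtc::RTCAudioSourceStats* source :
       report.GetStatsOfType<webrtc::RTCAudioSourceStats>()) {
    // The members stay undefined until an AudioSendStream exists and has
    // observed at least one audio frame (see the "Note" above).
    if (source->audio_level.is_defined()) {
      RTC_LOG(LS_INFO) << "audioLevel=" << *source->audio_level
                       << " totalAudioEnergy=" << *source->total_audio_energy
                       << " totalSamplesDuration="
                       << *source->total_samples_duration;
    }
  }
}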
-rw-r--r--  api/stats/rtcstats_objects.h          10
-rw-r--r--  audio/BUILD.gn                         1
-rw-r--r--  audio/audio_level.cc                  16
-rw-r--r--  audio/audio_level.h                   24
-rw-r--r--  audio/audio_send_stream.cc            25
-rw-r--r--  audio/audio_send_stream.h              5
-rw-r--r--  audio/audio_send_stream_unittest.cc   62
-rw-r--r--  audio/audio_state.cc                  12
-rw-r--r--  audio/audio_state.h                    1
-rw-r--r--  audio/audio_state_unittest.cc         44
-rw-r--r--  audio/audio_transport_impl.cc          4
-rw-r--r--  audio/audio_transport_impl.h           3
-rw-r--r--  audio/channel_receive.cc               4
-rw-r--r--  call/audio_send_stream.h               2
-rw-r--r--  call/audio_state.h                    10
-rw-r--r--  media/base/media_channel.h             1
-rw-r--r--  modules/audio_mixer/BUILD.gn          24
-rw-r--r--  pc/rtc_stats_collector.cc             58
-rw-r--r--  pc/rtc_stats_collector_unittest.cc    12
-rw-r--r--  pc/rtc_stats_integrationtest.cc       19
-rw-r--r--  stats/rtcstats_objects.cc             16
21 files changed, 223 insertions, 130 deletions
diff --git a/api/stats/rtcstats_objects.h b/api/stats/rtcstats_objects.h
index 1f8b973b2e..ebd79f4eb9 100644
--- a/api/stats/rtcstats_objects.h
+++ b/api/stats/rtcstats_objects.h
@@ -317,12 +317,12 @@ class RTC_EXPORT RTCMediaStreamTrackStats final : public RTCStats {
// TODO(hbos): Not collected by |RTCStatsCollector|. crbug.com/659137
RTCStatsMember<uint32_t> full_frames_lost;
// Audio-only members
- RTCStatsMember<double> audio_level;
- RTCStatsMember<double> total_audio_energy;
+ RTCStatsMember<double> audio_level; // Receive-only
+ RTCStatsMember<double> total_audio_energy; // Receive-only
RTCStatsMember<double> echo_return_loss;
RTCStatsMember<double> echo_return_loss_enhancement;
RTCStatsMember<uint64_t> total_samples_received;
- RTCStatsMember<double> total_samples_duration;
+ RTCStatsMember<double> total_samples_duration; // Receive-only
RTCStatsMember<uint64_t> concealed_samples;
RTCStatsMember<uint64_t> silent_concealed_samples;
RTCStatsMember<uint64_t> concealment_events;
@@ -548,6 +548,10 @@ class RTC_EXPORT RTCAudioSourceStats final : public RTCMediaSourceStats {
RTCAudioSourceStats(std::string&& id, int64_t timestamp_us);
RTCAudioSourceStats(const RTCAudioSourceStats& other);
~RTCAudioSourceStats() override;
+
+ RTCStatsMember<double> audio_level;
+ RTCStatsMember<double> total_audio_energy;
+ RTCStatsMember<double> total_samples_duration;
};
// https://w3c.github.io/webrtc-stats/#dom-rtcvideosourcestats
diff --git a/audio/BUILD.gn b/audio/BUILD.gn
index 7539f37793..ff38da0d80 100644
--- a/audio/BUILD.gn
+++ b/audio/BUILD.gn
@@ -146,6 +146,7 @@ if (rtc_include_tests) {
"../modules/audio_device:audio_device_impl", # For TestAudioDeviceModule
"../modules/audio_device:mock_audio_device",
"../modules/audio_mixer:audio_mixer_impl",
+ "../modules/audio_mixer:audio_mixer_test_utils",
"../modules/audio_processing:audio_processing_statistics",
"../modules/audio_processing:mocks",
"../modules/pacing",
diff --git a/audio/audio_level.cc b/audio/audio_level.cc
index 63b80a5d74..d26e949ccc 100644
--- a/audio/audio_level.cc
+++ b/audio/audio_level.cc
@@ -22,12 +22,21 @@ AudioLevel::AudioLevel()
AudioLevel::~AudioLevel() {}
+void AudioLevel::Reset() {
+ rtc::CritScope cs(&crit_sect_);
+ abs_max_ = 0;
+ count_ = 0;
+ current_level_full_range_ = 0;
+ total_energy_ = 0.0;
+ total_duration_ = 0.0;
+}
+
int16_t AudioLevel::LevelFullRange() const {
rtc::CritScope cs(&crit_sect_);
return current_level_full_range_;
}
-void AudioLevel::Clear() {
+void AudioLevel::ResetLevelFullRange() {
rtc::CritScope cs(&crit_sect_);
abs_max_ = 0;
count_ = 0;
@@ -60,7 +69,10 @@ void AudioLevel::ComputeLevel(const AudioFrame& audioFrame, double duration) {
if (abs_value > abs_max_)
abs_max_ = abs_value;
- // Update level approximately 10 times per second
+ // Update level approximately 9 times per second, assuming audio frame
+ // duration is approximately 10 ms. (The update frequency is every
+ // 11th (= |kUpdateFrequency+1|) call: 1000/(11*10)=9.09..., we should
+ // probably change this behavior, see https://crbug.com/webrtc/10784).
if (count_++ == kUpdateFrequency) {
current_level_full_range_ = abs_max_;
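An illustrative sketch (assumed standalone helper, not the real AudioLevel
internals) of the update cadence described in the comment above: the reported
level only refreshes once every kUpdateFrequency + 1 calls, and the running
maximum decays each time the level is published:

#include <algorithm>
#include <cstdint>

class LevelTrackerSketch {
 public:
  void ComputeLevel(int16_t frame_abs_max) {
    abs_max_ = std::max(abs_max_, frame_abs_max);
    // Publish a new level every kUpdateFrequency + 1 calls, i.e. roughly
    // 9 times per second when frames are 10 ms long.
    if (count_++ == kUpdateFrequency) {
      current_level_full_range_ = abs_max_;
      count_ = 0;
      // Decay the running maximum by a factor of 1/4, per the header comment,
      // so a brief peak does not dominate the reported level forever.
      abs_max_ >>= 2;
    }
  }
  int16_t LevelFullRange() const { return current_level_full_range_; }

 private:
  static constexpr int kUpdateFrequency = 10;  // Value assumed from the comment.
  int16_t abs_max_ = 0;
  int count_ = 0;
  int16_t current_level_full_range_ = 0;
};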
diff --git a/audio/audio_level.h b/audio/audio_level.h
index bb04cc06c2..430edb1703 100644
--- a/audio/audio_level.h
+++ b/audio/audio_level.h
@@ -19,17 +19,35 @@ namespace webrtc {
class AudioFrame;
namespace voe {
+// This class is thread-safe. However, TotalEnergy() and TotalDuration() are
+// related, so if you call ComputeLevel() on a different thread than you read
+// these values, you still need to use a lock to read them as a pair.
class AudioLevel {
public:
AudioLevel();
~AudioLevel();
+ void Reset();
+ // Returns the current audio level linearly [0,32767], which gets updated
+ // every "kUpdateFrequency+1" call to ComputeLevel() based on the maximum
+ // audio level of any audio frame, decaying by a factor of 1/4 each time
+ // LevelFullRange() gets updated.
// Called on "API thread(s)" from APIs like VoEBase::CreateChannel(),
- // VoEBase::StopSend()
+ // VoEBase::StopSend().
int16_t LevelFullRange() const;
- void Clear();
+ void ResetLevelFullRange();
// See the description for "totalAudioEnergy" in the WebRTC stats spec
- // (https://w3c.github.io/webrtc-stats/#dom-rtcmediastreamtrackstats-totalaudioenergy)
+ // (https://w3c.github.io/webrtc-stats/#dom-rtcaudiohandlerstats-totalaudioenergy)
+ // In our implementation, the total audio energy increases by the
+ // energy-equivalent of LevelFullRange() at the time of ComputeLevel(), rather
+ // than the energy of the samples in that specific audio frame. As a result,
+ // we may report a higher audio energy and audio level than the spec mandates.
+ // TODO(https://crbug.com/webrtc/10784): We should either do what the spec
+ // says or update the spec to match our implementation. If we want to have a
+ // decaying audio level we should probably update both the spec and the
+ // implementation to reduce the complexity of the definition. If we want to
+ // continue to have decaying audio we should have unittests covering the
+ // behavior of the decay.
double TotalEnergy() const;
double TotalDuration() const;
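To make the TODO above concrete, a hedged sketch of the accumulation rule
(a simplified standalone illustration, not the class itself): on every
ComputeLevel() call, the totals grow by the energy-equivalent of the current
full-range level over the frame's duration.

struct EnergyAccumulatorSketch {
  double total_energy = 0.0;    // Corresponds to TotalEnergy() / totalAudioEnergy.
  double total_duration = 0.0;  // Corresponds to TotalDuration() / totalSamplesDuration.

  void OnFrame(int16_t level_full_range, double duration_seconds) {
    // Normalize the linear [0,32767] level to [0,1], then accumulate
    // level^2 * duration, matching the spec definition of totalAudioEnergy.
    const double level = level_full_range / 32767.0;
    total_energy += level * level * duration_seconds;
    total_duration += duration_seconds;
  }
};
// Example: ten 10 ms frames at the maximum level add 10 * 1.0^2 * 0.01 = 0.1
// to total_energy, which is what the GetStatsAudioLevel unittest below expects.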
diff --git a/audio/audio_send_stream.cc b/audio/audio_send_stream.cc
index 24f6fe1abb..9190441678 100644
--- a/audio/audio_send_stream.cc
+++ b/audio/audio_send_stream.cc
@@ -362,6 +362,21 @@ void AudioSendStream::Stop() {
void AudioSendStream::SendAudioData(std::unique_ptr<AudioFrame> audio_frame) {
RTC_CHECK_RUNS_SERIALIZED(&audio_capture_race_checker_);
+ RTC_DCHECK_GT(audio_frame->sample_rate_hz_, 0);
+ double duration = static_cast<double>(audio_frame->samples_per_channel_) /
+ audio_frame->sample_rate_hz_;
+ {
+ // Note: SendAudioData() passes the frame further down the pipeline and it
+ // may eventually get sent. But this method is invoked even if we are not
+ // connected, as long as we have an AudioSendStream (created as a result of
+ // an O/A exchange). This means that we are calculating audio levels whether
+ // or not we are sending samples.
+ // TODO(https://crbug.com/webrtc/10771): All "media-source" related stats
+ // should move from send-streams to the local audio sources or tracks; a
+ // send-stream should not be required to read the microphone audio levels.
+ rtc::CritScope cs(&audio_level_lock_);
+ audio_level_.ComputeLevel(*audio_frame, duration);
+ }
channel_send_->ProcessAndEncodeAudio(std::move(audio_frame));
}
@@ -423,10 +438,12 @@ webrtc::AudioSendStream::Stats AudioSendStream::GetStats(
}
}
- AudioState::Stats input_stats = audio_state()->GetAudioInputStats();
- stats.audio_level = input_stats.audio_level;
- stats.total_input_energy = input_stats.total_energy;
- stats.total_input_duration = input_stats.total_duration;
+ {
+ rtc::CritScope cs(&audio_level_lock_);
+ stats.audio_level = audio_level_.LevelFullRange();
+ stats.total_input_energy = audio_level_.TotalEnergy();
+ stats.total_input_duration = audio_level_.TotalDuration();
+ }
stats.typing_noise_detected = audio_state()->typing_noise_detected();
stats.ana_statistics = channel_send_->GetANAStatistics();
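A small worked example (values assumed for illustration) of the duration term
fed into ComputeLevel() in SendAudioData() above: the duration is simply
samples_per_channel / sample_rate_hz, so a typical 10 ms frame contributes
0.01 s.

#include <cassert>
#include <cmath>

void DurationExampleSketch() {
  const int samples_per_channel = 480;  // A 10 ms mono frame at 48 kHz.
  const int sample_rate_hz = 48000;
  const double duration =
      static_cast<double>(samples_per_channel) / sample_rate_hz;
  // 480 / 48000 = 0.01 s; ten such frames yield the 0.1 s of
  // total_input_duration asserted in the unittest below.
  assert(std::fabs(duration - 0.01) < 1e-9);
}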
diff --git a/audio/audio_send_stream.h b/audio/audio_send_stream.h
index 9796e80c90..fd65296ef4 100644
--- a/audio/audio_send_stream.h
+++ b/audio/audio_send_stream.h
@@ -14,6 +14,7 @@
#include <memory>
#include <vector>
+#include "audio/audio_level.h"
#include "audio/channel_send.h"
#include "audio/transport_feedback_packet_loss_tracker.h"
#include "call/audio_send_stream.h"
@@ -160,6 +161,10 @@ class AudioSendStream final : public webrtc::AudioSendStream,
int encoder_sample_rate_hz_ = 0;
size_t encoder_num_channels_ = 0;
bool sending_ = false;
+ rtc::CriticalSection audio_level_lock_;
+ // Keeps track of audio level, total audio energy and total samples duration.
+ // https://w3c.github.io/webrtc-stats/#dom-rtcaudiohandlerstats-totalaudioenergy
+ webrtc::voe::AudioLevel audio_level_;
BitrateAllocatorInterface* const bitrate_allocator_
RTC_GUARDED_BY(worker_queue_);
diff --git a/audio/audio_send_stream_unittest.cc b/audio/audio_send_stream_unittest.cc
index 453175595b..022516ad87 100644
--- a/audio/audio_send_stream_unittest.cc
+++ b/audio/audio_send_stream_unittest.cc
@@ -23,6 +23,7 @@
#include "logging/rtc_event_log/mock/mock_rtc_event_log.h"
#include "modules/audio_device/include/mock_audio_device.h"
#include "modules/audio_mixer/audio_mixer_impl.h"
+#include "modules/audio_mixer/sine_wave_generator.h"
#include "modules/audio_processing/include/audio_processing_statistics.h"
#include "modules/audio_processing/include/mock_audio_processing.h"
#include "modules/rtp_rtcp/mocks/mock_rtcp_bandwidth_observer.h"
@@ -40,6 +41,7 @@ namespace test {
namespace {
using ::testing::_;
+using ::testing::AnyNumber;
using ::testing::Eq;
using ::testing::Field;
using ::testing::Invoke;
@@ -47,6 +49,8 @@ using ::testing::Ne;
using ::testing::Return;
using ::testing::StrEq;
+static const float kTolerance = 0.0001f;
+
const uint32_t kSsrc = 1234;
const char* kCName = "foo_name";
const int kAudioLevelId = 2;
@@ -317,6 +321,24 @@ struct ConfigHelper {
TaskQueueForTest worker_queue_;
std::unique_ptr<AudioEncoder> audio_encoder_;
};
+
+// The audio level ranges linearly [0,32767].
+std::unique_ptr<AudioFrame> CreateAudioFrame1kHzSineWave(int16_t audio_level,
+ int duration_ms,
+ int sample_rate_hz,
+ size_t num_channels) {
+ size_t samples_per_channel = sample_rate_hz / (1000 / duration_ms);
+ std::vector<int16_t> audio_data(samples_per_channel * num_channels, 0);
+ std::unique_ptr<AudioFrame> audio_frame = absl::make_unique<AudioFrame>();
+ audio_frame->UpdateFrame(0 /* RTP timestamp */, &audio_data[0],
+ samples_per_channel, sample_rate_hz,
+ AudioFrame::SpeechType::kNormalSpeech,
+ AudioFrame::VADActivity::kVadUnknown, num_channels);
+ SineWaveGenerator wave_generator(1000.0, audio_level);
+ wave_generator.GenerateNextFrame(audio_frame.get());
+ return audio_frame;
+}
+
} // namespace
TEST(AudioSendStreamTest, ConfigToString) {
@@ -415,6 +437,46 @@ TEST(AudioSendStreamTest, GetStats) {
EXPECT_FALSE(stats.typing_noise_detected);
}
+TEST(AudioSendStreamTest, GetStatsAudioLevel) {
+ ConfigHelper helper(false, true);
+ auto send_stream = helper.CreateAudioSendStream();
+ helper.SetupMockForGetStats();
+ EXPECT_CALL(*helper.channel_send(), ProcessAndEncodeAudioForMock(_))
+ .Times(AnyNumber());
+
+ constexpr int kSampleRateHz = 48000;
+ constexpr size_t kNumChannels = 1;
+
+ constexpr int16_t kSilentAudioLevel = 0;
+ constexpr int16_t kMaxAudioLevel = 32767; // Audio level is [0,32767].
+ constexpr int kAudioFrameDurationMs = 10;
+
+ // Process 10 audio frames (100 ms) of silence. After this, on the next
+ // (11th) frame, the audio level will be updated with the maximum audio level
+ // of the first 11 frames. See AudioLevel.
+ for (size_t i = 0; i < 10; ++i) {
+ send_stream->SendAudioData(CreateAudioFrame1kHzSineWave(
+ kSilentAudioLevel, kAudioFrameDurationMs, kSampleRateHz, kNumChannels));
+ }
+ AudioSendStream::Stats stats = send_stream->GetStats();
+ EXPECT_EQ(kSilentAudioLevel, stats.audio_level);
+ EXPECT_NEAR(0.0f, stats.total_input_energy, kTolerance);
+ EXPECT_NEAR(0.1f, stats.total_input_duration, kTolerance); // 100 ms = 0.1 s
+
+ // Process 10 audio frames (100 ms) of maximum audio level.
+ // Note that AudioLevel updates the audio level every 11th frame, so
+ // processing 10 frames above was needed to see a non-zero audio level here.
+ for (size_t i = 0; i < 10; ++i) {
+ send_stream->SendAudioData(CreateAudioFrame1kHzSineWave(
+ kMaxAudioLevel, kAudioFrameDurationMs, kSampleRateHz, kNumChannels));
+ }
+ stats = send_stream->GetStats();
+ EXPECT_EQ(kMaxAudioLevel, stats.audio_level);
+ // Energy increases by energy*duration, where energy is audio level in [0,1].
+ EXPECT_NEAR(0.1f, stats.total_input_energy, kTolerance); // 0.1 s of max
+ EXPECT_NEAR(0.2f, stats.total_input_duration, kTolerance); // 200 ms = 0.2 s
+}
+
TEST(AudioSendStreamTest, SendCodecAppliesAudioNetworkAdaptor) {
ConfigHelper helper(false, true);
helper.config().send_codec_spec =
diff --git a/audio/audio_state.cc b/audio/audio_state.cc
index edba0cfff7..52c4504fb7 100644
--- a/audio/audio_state.cc
+++ b/audio/audio_state.cc
@@ -151,18 +151,6 @@ void AudioState::SetRecording(bool enabled) {
}
}
-AudioState::Stats AudioState::GetAudioInputStats() const {
- RTC_DCHECK(thread_checker_.IsCurrent());
- const voe::AudioLevel& audio_level = audio_transport_.audio_level();
- Stats result;
- result.audio_level = audio_level.LevelFullRange();
- RTC_DCHECK_LE(0, result.audio_level);
- RTC_DCHECK_GE(32767, result.audio_level);
- result.total_energy = audio_level.TotalEnergy();
- result.total_duration = audio_level.TotalDuration();
- return result;
-}
-
void AudioState::SetStereoChannelSwapping(bool enable) {
RTC_DCHECK(thread_checker_.IsCurrent());
audio_transport_.SetStereoChannelSwapping(enable);
diff --git a/audio/audio_state.h b/audio/audio_state.h
index 60250da89a..15d1641f70 100644
--- a/audio/audio_state.h
+++ b/audio/audio_state.h
@@ -41,7 +41,6 @@ class AudioState : public webrtc::AudioState {
void SetPlayout(bool enabled) override;
void SetRecording(bool enabled) override;
- Stats GetAudioInputStats() const override;
void SetStereoChannelSwapping(bool enable) override;
AudioDeviceModule* audio_device_module() {
diff --git a/audio/audio_state_unittest.cc b/audio/audio_state_unittest.cc
index ed5ca223d5..61db5d94ca 100644
--- a/audio/audio_state_unittest.cc
+++ b/audio/audio_state_unittest.cc
@@ -56,13 +56,6 @@ class FakeAudioSource : public AudioMixer::Source {
AudioFrameInfo(int sample_rate_hz, AudioFrame* audio_frame));
};
-std::vector<int16_t> Create10msSilentTestData(int sample_rate_hz,
- size_t num_channels) {
- const int samples_per_channel = sample_rate_hz / 100;
- std::vector<int16_t> audio_data(samples_per_channel * num_channels, 0);
- return audio_data;
-}
-
std::vector<int16_t> Create10msTestData(int sample_rate_hz,
size_t num_channels) {
const int samples_per_channel = sample_rate_hz / 100;
@@ -223,43 +216,6 @@ TEST(AudioStateTest, EnableChannelSwap) {
audio_state->RemoveSendingStream(&stream);
}
-TEST(AudioStateTest, InputLevelStats) {
- constexpr int kSampleRate = 16000;
- constexpr size_t kNumChannels = 1;
-
- ConfigHelper helper;
- rtc::scoped_refptr<internal::AudioState> audio_state(
- new rtc::RefCountedObject<internal::AudioState>(helper.config()));
-
- // Push a silent buffer -> Level stats should be zeros except for duration.
- {
- auto audio_data = Create10msSilentTestData(kSampleRate, kNumChannels);
- uint32_t new_mic_level = 667;
- audio_state->audio_transport()->RecordedDataIsAvailable(
- &audio_data[0], kSampleRate / 100, kNumChannels * 2, kNumChannels,
- kSampleRate, 0, 0, 0, false, new_mic_level);
- auto stats = audio_state->GetAudioInputStats();
- EXPECT_EQ(0, stats.audio_level);
- EXPECT_THAT(stats.total_energy, ::testing::DoubleEq(0.0));
- EXPECT_THAT(stats.total_duration, ::testing::DoubleEq(0.01));
- }
-
- // Push 10 non-silent buffers -> Level stats should be non-zero.
- {
- auto audio_data = Create10msTestData(kSampleRate, kNumChannels);
- uint32_t new_mic_level = 667;
- for (int i = 0; i < 10; ++i) {
- audio_state->audio_transport()->RecordedDataIsAvailable(
- &audio_data[0], kSampleRate / 100, kNumChannels * 2, kNumChannels,
- kSampleRate, 0, 0, 0, false, new_mic_level);
- }
- auto stats = audio_state->GetAudioInputStats();
- EXPECT_EQ(32767, stats.audio_level);
- EXPECT_THAT(stats.total_energy, ::testing::DoubleEq(0.01));
- EXPECT_THAT(stats.total_duration, ::testing::DoubleEq(0.11));
- }
-}
-
TEST(AudioStateTest,
QueryingTransportForAudioShouldResultInGetAudioCallOnMixerSource) {
ConfigHelper helper;
diff --git a/audio/audio_transport_impl.cc b/audio/audio_transport_impl.cc
index 2e6ff52108..aca6f9baf6 100644
--- a/audio/audio_transport_impl.cc
+++ b/audio/audio_transport_impl.cc
@@ -142,10 +142,6 @@ int32_t AudioTransportImpl::RecordedDataIsAvailable(
}
}
- // Measure audio level of speech after all processing.
- double sample_duration = static_cast<double>(number_of_frames) / sample_rate;
- audio_level_.ComputeLevel(*audio_frame, sample_duration);
-
// Copy frame and push to each sending stream. The copy is required since an
// encoding task will be posted internally to each stream.
{
diff --git a/audio/audio_transport_impl.h b/audio/audio_transport_impl.h
index 4c244a1c13..8a74d98adf 100644
--- a/audio/audio_transport_impl.h
+++ b/audio/audio_transport_impl.h
@@ -15,7 +15,6 @@
#include "api/audio/audio_mixer.h"
#include "api/scoped_refptr.h"
-#include "audio/audio_level.h"
#include "common_audio/resampler/include/push_resampler.h"
#include "modules/audio_device/include/audio_device.h"
#include "modules/audio_processing/include/audio_processing.h"
@@ -66,7 +65,6 @@ class AudioTransportImpl : public AudioTransport {
size_t send_num_channels);
void SetStereoChannelSwapping(bool enable);
bool typing_noise_detected() const;
- const voe::AudioLevel& audio_level() const { return audio_level_; }
private:
// Shared.
@@ -80,7 +78,6 @@ class AudioTransportImpl : public AudioTransport {
bool typing_noise_detected_ RTC_GUARDED_BY(capture_lock_) = false;
bool swap_stereo_channels_ RTC_GUARDED_BY(capture_lock_) = false;
PushResampler<int16_t> capture_resampler_;
- voe::AudioLevel audio_level_;
TypingDetection typing_detection_;
// Render side.
diff --git a/audio/channel_receive.cc b/audio/channel_receive.cc
index 8b9dd2d7f2..971a40a19b 100644
--- a/audio/channel_receive.cc
+++ b/audio/channel_receive.cc
@@ -483,7 +483,7 @@ ChannelReceive::ChannelReceive(
jitter_buffer_enable_rtx_handling;
audio_coding_.reset(AudioCodingModule::Create(acm_config));
- _outputAudioLevel.Clear();
+ _outputAudioLevel.ResetLevelFullRange();
rtp_receive_statistics_->EnableRetransmitDetection(remote_ssrc_, true);
RtpRtcp::Configuration configuration;
@@ -546,7 +546,7 @@ void ChannelReceive::StopPlayout() {
RTC_DCHECK(worker_thread_checker_.IsCurrent());
rtc::CritScope lock(&playing_lock_);
playing_ = false;
- _outputAudioLevel.Clear();
+ _outputAudioLevel.ResetLevelFullRange();
}
absl::optional<std::pair<int, SdpAudioFormat>>
diff --git a/call/audio_send_stream.h b/call/audio_send_stream.h
index d8fdddb764..f479492b9d 100644
--- a/call/audio_send_stream.h
+++ b/call/audio_send_stream.h
@@ -56,7 +56,7 @@ class AudioSendStream {
int32_t ext_seqnum = -1;
int32_t jitter_ms = -1;
int64_t rtt_ms = -1;
- int32_t audio_level = -1;
+ int16_t audio_level = 0;
// See description of "totalAudioEnergy" in the WebRTC stats spec:
// https://w3c.github.io/webrtc-stats/#dom-rtcmediastreamtrackstats-totalaudioenergy
double total_input_energy = 0.0;
diff --git a/call/audio_state.h b/call/audio_state.h
index 18cbd48ca9..89267c5ab3 100644
--- a/call/audio_state.h
+++ b/call/audio_state.h
@@ -39,15 +39,6 @@ class AudioState : public rtc::RefCountInterface {
rtc::scoped_refptr<webrtc::AudioDeviceModule> audio_device_module;
};
- struct Stats {
- // Audio peak level (max(abs())), linearly on the interval [0,32767].
- int32_t audio_level = -1;
- // See:
- // https://w3c.github.io/webrtc-stats/#dom-rtcmediastreamtrackstats-totalaudioenergy
- double total_energy = 0.0f;
- double total_duration = 0.0f;
- };
-
virtual AudioProcessing* audio_processing() = 0;
virtual AudioTransport* audio_transport() = 0;
@@ -62,7 +53,6 @@ class AudioState : public rtc::RefCountInterface {
// packets will be encoded or transmitted.
virtual void SetRecording(bool enabled) = 0;
- virtual Stats GetAudioInputStats() const = 0;
virtual void SetStereoChannelSwapping(bool enable) = 0;
static rtc::scoped_refptr<AudioState> Create(
diff --git a/media/base/media_channel.h b/media/base/media_channel.h
index 80a7e11024..8c4e8b8e2c 100644
--- a/media/base/media_channel.h
+++ b/media/base/media_channel.h
@@ -468,6 +468,7 @@ struct VoiceSenderInfo : public MediaSenderInfo {
~VoiceSenderInfo();
int ext_seqnum = 0;
int jitter_ms = 0;
+ // Current audio level, expressed linearly [0,32767].
int audio_level = 0;
// See description of "totalAudioEnergy" in the WebRTC stats spec:
// https://w3c.github.io/webrtc-stats/#dom-rtcmediastreamtrackstats-totalaudioenergy
diff --git a/modules/audio_mixer/BUILD.gn b/modules/audio_mixer/BUILD.gn
index 70729db150..7354447fbf 100644
--- a/modules/audio_mixer/BUILD.gn
+++ b/modules/audio_mixer/BUILD.gn
@@ -76,13 +76,10 @@ rtc_static_library("audio_frame_manipulator") {
}
if (rtc_include_tests) {
- rtc_source_set("audio_mixer_unittests") {
+ rtc_source_set("audio_mixer_test_utils") {
testonly = true
sources = [
- "audio_frame_manipulator_unittest.cc",
- "audio_mixer_impl_unittest.cc",
- "frame_combiner_unittest.cc",
"gain_change_calculator.cc",
"gain_change_calculator.h",
"sine_wave_generator.cc",
@@ -94,6 +91,25 @@ if (rtc_include_tests) {
":audio_mixer_impl",
"../../api:array_view",
"../../api/audio:audio_frame_api",
+ "../../rtc_base:checks",
+ "../../rtc_base:rtc_base_approved",
+ ]
+ }
+
+ rtc_source_set("audio_mixer_unittests") {
+ testonly = true
+
+ sources = [
+ "audio_frame_manipulator_unittest.cc",
+ "audio_mixer_impl_unittest.cc",
+ "frame_combiner_unittest.cc",
+ ]
+
+ deps = [
+ ":audio_frame_manipulator",
+ ":audio_mixer_impl",
+ ":audio_mixer_test_utils",
+ "../../api:array_view",
"../../api/audio:audio_mixer_api",
"../../audio/utility:audio_frame_operations",
"../../rtc_base:checks",
diff --git a/pc/rtc_stats_collector.cc b/pc/rtc_stats_collector.cc
index ec917aec72..0ccfd18e4a 100644
--- a/pc/rtc_stats_collector.cc
+++ b/pc/rtc_stats_collector.cc
@@ -551,13 +551,6 @@ ProduceMediaStreamTrackStatsFromVoiceSenderInfo(
attachment_id);
audio_track_stats->remote_source = false;
audio_track_stats->detached = false;
- if (voice_sender_info.audio_level >= 0) {
- audio_track_stats->audio_level = DoubleAudioLevelFromIntAudioLevel(
- voice_sender_info.audio_level);
- }
- audio_track_stats->total_audio_energy = voice_sender_info.total_input_energy;
- audio_track_stats->total_samples_duration =
- voice_sender_info.total_input_duration;
if (voice_sender_info.apm_statistics.echo_return_loss) {
audio_track_stats->echo_return_loss =
*voice_sender_info.apm_statistics.echo_return_loss;
@@ -1395,18 +1388,38 @@ void RTCStatsCollector::ProduceMediaSourceStats_s(
const auto& track = sender_internal->track();
if (!track)
continue;
- // TODO(hbos): The same track could be attached to multiple senders which
- // should result in multiple senders referencing the same media source
- // stats. When all media source related metrics are moved to the track's
- // source (e.g. input frame rate is moved from cricket::VideoSenderInfo to
- // VideoTrackSourceInterface::Stats), don't create separate media source
- // stats objects on a per-attachment basis.
+ // TODO(https://crbug.com/webrtc/10771): The same track could be attached
+ // to multiple senders which should result in multiple senders referencing
+ // the same media-source stats. When all media source related metrics are
+ // moved to the track's source (e.g. input frame rate is moved from
+ // cricket::VideoSenderInfo to VideoTrackSourceInterface::Stats and audio
+ // levels are moved to the corresponding audio track/source object), don't
+ // create separate media source stats objects on a per-attachment basis.
std::unique_ptr<RTCMediaSourceStats> media_source_stats;
if (track->kind() == MediaStreamTrackInterface::kAudioKind) {
- media_source_stats = absl::make_unique<RTCAudioSourceStats>(
+ auto audio_source_stats = absl::make_unique<RTCAudioSourceStats>(
RTCMediaSourceStatsIDFromKindAndAttachment(
cricket::MEDIA_TYPE_AUDIO, sender_internal->AttachmentId()),
timestamp_us);
+ // TODO(https://crbug.com/webrtc/10771): We shouldn't need to have an
+ // SSRC assigned (there shouldn't need to exist a send-stream, created
+ // by an O/A exchange) in order to read audio media-source stats.
+ // TODO(https://crbug.com/webrtc/8694): SSRC 0 shouldn't be a magic
+ // value indicating no SSRC.
+ if (sender_internal->ssrc() != 0) {
+ auto* voice_sender_info =
+ track_media_info_map->GetVoiceSenderInfoBySsrc(
+ sender_internal->ssrc());
+ if (voice_sender_info) {
+ audio_source_stats->audio_level = DoubleAudioLevelFromIntAudioLevel(
+ voice_sender_info->audio_level);
+ audio_source_stats->total_audio_energy =
+ voice_sender_info->total_input_energy;
+ audio_source_stats->total_samples_duration =
+ voice_sender_info->total_input_duration;
+ }
+ }
+ media_source_stats = std::move(audio_source_stats);
} else {
RTC_DCHECK_EQ(MediaStreamTrackInterface::kVideoKind, track->kind());
auto video_source_stats = absl::make_unique<RTCVideoSourceStats>(
@@ -1420,15 +1433,18 @@ void RTCStatsCollector::ProduceMediaSourceStats_s(
video_source_stats->width = source_stats.input_width;
video_source_stats->height = source_stats.input_height;
}
- // TODO(hbos): Source stats should not depend on whether or not we are
- // connected/have an SSRC assigned. Related to
- // https://crbug.com/webrtc/8694 (using ssrc 0 to indicate "none").
+ // TODO(https://crbug.com/webrtc/10771): We shouldn't need to have an
+ // SSRC assigned (there shouldn't need to exist a send-stream, created
+ // by an O/A exchange) in order to get framesPerSecond.
+ // TODO(https://crbug.com/webrtc/8694): SSRC 0 shouldn't be a magic
+ // value indicating no SSRC.
if (sender_internal->ssrc() != 0) {
- auto* sender_info = track_media_info_map->GetVideoSenderInfoBySsrc(
- sender_internal->ssrc());
- if (sender_info) {
+ auto* video_sender_info =
+ track_media_info_map->GetVideoSenderInfoBySsrc(
+ sender_internal->ssrc());
+ if (video_sender_info) {
video_source_stats->frames_per_second =
- sender_info->framerate_input;
+ video_sender_info->framerate_input;
}
}
media_source_stats = std::move(video_source_stats);
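For clarity, a hedged sketch of the unit conversion the collector relies on
above via DoubleAudioLevelFromIntAudioLevel() (the helper's body is not shown
in this diff; presumably a division by 32767): the media layer reports a
linear level in [0,32767], while RTCAudioSourceStats.audioLevel is in [0,1].

// Illustrative only; the real conversion lives in pc/rtc_stats_collector.cc.
double DoubleAudioLevelFromIntAudioLevelSketch(int audio_level) {
  // 0 -> 0.0 and 32767 -> 1.0, matching the unittest below, which feeds in
  // audio_level = 32767 and expects RTCAudioSourceStats.audio_level = 1.0.
  return audio_level / 32767.0;
}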
diff --git a/pc/rtc_stats_collector_unittest.cc b/pc/rtc_stats_collector_unittest.cc
index 963a3bc829..02f6654694 100644
--- a/pc/rtc_stats_collector_unittest.cc
+++ b/pc/rtc_stats_collector_unittest.cc
@@ -1438,9 +1438,6 @@ TEST_F(RTCStatsCollectorTest,
cricket::VoiceSenderInfo voice_sender_info_ssrc1;
voice_sender_info_ssrc1.local_stats.push_back(cricket::SsrcSenderInfo());
voice_sender_info_ssrc1.local_stats[0].ssrc = 1;
- voice_sender_info_ssrc1.audio_level = 32767;
- voice_sender_info_ssrc1.total_input_energy = 0.25;
- voice_sender_info_ssrc1.total_input_duration = 0.5;
voice_sender_info_ssrc1.apm_statistics.echo_return_loss = 42.0;
voice_sender_info_ssrc1.apm_statistics.echo_return_loss_enhancement = 52.0;
@@ -1471,9 +1468,6 @@ TEST_F(RTCStatsCollectorTest,
expected_local_audio_track_ssrc1.remote_source = false;
expected_local_audio_track_ssrc1.ended = true;
expected_local_audio_track_ssrc1.detached = false;
- expected_local_audio_track_ssrc1.audio_level = 1.0;
- expected_local_audio_track_ssrc1.total_audio_energy = 0.25;
- expected_local_audio_track_ssrc1.total_samples_duration = 0.5;
expected_local_audio_track_ssrc1.echo_return_loss = 42.0;
expected_local_audio_track_ssrc1.echo_return_loss_enhancement = 52.0;
ASSERT_TRUE(report->Get(expected_local_audio_track_ssrc1.id()))
@@ -2219,6 +2213,9 @@ TEST_F(RTCStatsCollectorTest, RTCAudioSourceStatsCollectedForSenderWithTrack) {
voice_media_info.senders.push_back(cricket::VoiceSenderInfo());
voice_media_info.senders[0].local_stats.push_back(cricket::SsrcSenderInfo());
voice_media_info.senders[0].local_stats[0].ssrc = kSsrc;
+ voice_media_info.senders[0].audio_level = 32767; // [0,32767]
+ voice_media_info.senders[0].total_input_energy = 2.0;
+ voice_media_info.senders[0].total_input_duration = 3.0;
auto* voice_media_channel = pc_->AddVoiceChannel("AudioMid", "TransportName");
voice_media_channel->SetStats(voice_media_info);
stats_->SetupLocalTrackAndSender(cricket::MEDIA_TYPE_AUDIO,
@@ -2231,6 +2228,9 @@ TEST_F(RTCStatsCollectorTest, RTCAudioSourceStatsCollectedForSenderWithTrack) {
report->timestamp_us());
expected_audio.track_identifier = "LocalAudioTrackID";
expected_audio.kind = "audio";
+ expected_audio.audio_level = 1.0; // [0,1]
+ expected_audio.total_audio_energy = 2.0;
+ expected_audio.total_samples_duration = 3.0;
ASSERT_TRUE(report->Get(expected_audio.id()));
EXPECT_EQ(report->Get(expected_audio.id())->cast_to<RTCAudioSourceStats>(),
diff --git a/pc/rtc_stats_integrationtest.cc b/pc/rtc_stats_integrationtest.cc
index a05fa0e61a..adb986dac1 100644
--- a/pc/rtc_stats_integrationtest.cc
+++ b/pc/rtc_stats_integrationtest.cc
@@ -647,8 +647,13 @@ class RTCStatsReportVerifier {
media_stream_track.jitter_buffer_delay);
verifier.TestMemberIsNonNegative<uint64_t>(
media_stream_track.jitter_buffer_emitted_count);
- verifier.TestMemberIsNonNegative<uint64_t>(
+ verifier.TestMemberIsPositive<double>(media_stream_track.audio_level);
+ verifier.TestMemberIsPositive<double>(
+ media_stream_track.total_audio_energy);
+ verifier.TestMemberIsPositive<uint64_t>(
media_stream_track.total_samples_received);
+ verifier.TestMemberIsPositive<double>(
+ media_stream_track.total_samples_duration);
verifier.TestMemberIsNonNegative<uint64_t>(
media_stream_track.concealed_samples);
verifier.TestMemberIsNonNegative<uint64_t>(
@@ -676,8 +681,12 @@ class RTCStatsReportVerifier {
verifier.TestMemberIsUndefined(media_stream_track.jitter_buffer_delay);
verifier.TestMemberIsUndefined(
media_stream_track.jitter_buffer_emitted_count);
+ verifier.TestMemberIsUndefined(media_stream_track.audio_level);
+ verifier.TestMemberIsUndefined(media_stream_track.total_audio_energy);
verifier.TestMemberIsUndefined(
media_stream_track.total_samples_received);
+ verifier.TestMemberIsUndefined(
+ media_stream_track.total_samples_duration);
verifier.TestMemberIsUndefined(media_stream_track.concealed_samples);
verifier.TestMemberIsUndefined(media_stream_track.concealment_events);
verifier.TestMemberIsUndefined(
@@ -710,11 +719,6 @@ class RTCStatsReportVerifier {
verifier.TestMemberIsUndefined(
media_stream_track.sum_squared_frame_durations);
// Audio-only members
- verifier.TestMemberIsNonNegative<double>(media_stream_track.audio_level);
- verifier.TestMemberIsNonNegative<double>(
- media_stream_track.total_audio_energy);
- verifier.TestMemberIsNonNegative<double>(
- media_stream_track.total_samples_duration);
// TODO(hbos): |echo_return_loss| and |echo_return_loss_enhancement| are
// flaky on msan bot (sometimes defined, sometimes undefined). Should the
// test run until available or is there a way to have it always be
@@ -903,6 +907,9 @@ class RTCStatsReportVerifier {
bool VerifyRTCAudioSourceStats(const RTCAudioSourceStats& audio_source) {
RTCStatsVerifier verifier(report_, &audio_source);
VerifyRTCMediaSourceStats(audio_source, &verifier);
+ verifier.TestMemberIsPositive<double>(audio_source.audio_level);
+ verifier.TestMemberIsPositive<double>(audio_source.total_audio_energy);
+ verifier.TestMemberIsPositive<double>(audio_source.total_samples_duration);
return verifier.ExpectAllMembersSuccessfullyTested();
}
diff --git a/stats/rtcstats_objects.cc b/stats/rtcstats_objects.cc
index 949e74c4a5..bd24ce1be0 100644
--- a/stats/rtcstats_objects.cc
+++ b/stats/rtcstats_objects.cc
@@ -803,8 +803,10 @@ RTCMediaSourceStats::RTCMediaSourceStats(const RTCMediaSourceStats& other)
RTCMediaSourceStats::~RTCMediaSourceStats() {}
// clang-format off
-WEBRTC_RTCSTATS_IMPL_NO_MEMBERS(
- RTCAudioSourceStats, RTCMediaSourceStats, "media-source")
+WEBRTC_RTCSTATS_IMPL(RTCAudioSourceStats, RTCMediaSourceStats, "media-source",
+ &audio_level,
+ &total_audio_energy,
+ &total_samples_duration)
// clang-format on
RTCAudioSourceStats::RTCAudioSourceStats(const std::string& id,
@@ -812,10 +814,16 @@ RTCAudioSourceStats::RTCAudioSourceStats(const std::string& id,
: RTCAudioSourceStats(std::string(id), timestamp_us) {}
RTCAudioSourceStats::RTCAudioSourceStats(std::string&& id, int64_t timestamp_us)
- : RTCMediaSourceStats(std::move(id), timestamp_us) {}
+ : RTCMediaSourceStats(std::move(id), timestamp_us),
+ audio_level("audioLevel"),
+ total_audio_energy("totalAudioEnergy"),
+ total_samples_duration("totalSamplesDuration") {}
RTCAudioSourceStats::RTCAudioSourceStats(const RTCAudioSourceStats& other)
- : RTCMediaSourceStats(other) {}
+ : RTCMediaSourceStats(other),
+ audio_level(other.audio_level),
+ total_audio_energy(other.total_audio_energy),
+ total_samples_duration(other.total_samples_duration) {}
RTCAudioSourceStats::~RTCAudioSourceStats() {}
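Finally, a minimal usage sketch (id and values assumed) showing that the
members wired up above are ordinary RTCStatsMember<double> fields, so they can
be assigned and serialized like any other stats member, e.g. via
RTCStats::ToJson():

#include <iostream>

#include "api/stats/rtcstats_objects.h"

void PrintExampleAudioSourceStats() {
  webrtc::RTCAudioSourceStats stats("RTCAudioSource_1", /*timestamp_us=*/0);
  stats.audio_level = 0.5;             // Dictionary name "audioLevel".
  stats.total_audio_energy = 0.25;     // "totalAudioEnergy".
  stats.total_samples_duration = 1.0;  // "totalSamplesDuration".
  std::cout << stats.ToJson() << std::endl;
}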