Voice Message Feature Implementation

Date: 2024-12-28
Status: Completed
Files involved: chat_page.dart, send_message.dart, api_client.dart, server.go, pubspec.yaml, AndroidManifest.xml

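1. Background and requirements

1.1 Requirements

Support voice messages in the chat screen:

Add a microphone button to the right of the plus button
Hold to record, release to send
Implement a voice message card UI (play/pause + duration display)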

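1.2 Design principles

Following the lessons from e2e-import-export-refactor.md, avoid over-implementation:

Minimum viable: implement only recording and playback; no speech-to-text
Reuse what exists: reuse the file upload logic; do not add a dedicated endpoint
Progressive enhancement: start with a simple implementation and iterate later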

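2. Technology choices

2.1 Recording library

Library	Pros	Cons
record	Concise API, cross-platform	Third-party library; version compatibility needs attention
flutter_sound	Feature-rich	Complex API
Native implementation	Most stable	Multiple codebases to maintain

Choice: record ^6.0.0 (note: with record 5.x, the record_linux plugin is incompatible with the platform interface, so 6.x is required; see 6.1)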

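2.2 Audio playback library

Library	Pros	Cons
just_audio	Supports streaming playback + custom request headers	Many dependencies
audioplayers	Lightweight	No request header support

Choice: just_audio ^0.9.40

Rationale: playback must send custom HTTP request headers (carrying the auth token) and must support streaming, decrypt-as-you-play playback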

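2.3 Audio format

Format: AAC (m4a container)
Encoder: AudioEncoder.aacLc
Bitrate: 128 kbps
Sample rate: 44100 Hz

Rationale: AAC is broadly supported on mobile platforms and gives good quality at a high compression ratio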
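To make the compression claim concrete, the arithmetic can be sketched as below. This is a rough estimate only: it treats 128 kbps as a constant bitrate, ignores m4a container overhead, and assumes mono 16-bit PCM as the uncompressed baseline (the channel count is not stated in this document). The function names are illustrative.

```go
package main

import "fmt"

// aacBytes estimates the payload size of a constant-bitrate stream.
// bitRate is in bits per second; container overhead is ignored.
func aacBytes(bitRate, durationMs int64) int64 {
	return bitRate * durationMs / 8 / 1000
}

// pcmBytes returns the size of uncompressed 16-bit PCM audio
// (2 bytes per sample per channel).
func pcmBytes(sampleRate, channels, durationMs int64) int64 {
	return sampleRate * 2 * channels * durationMs / 1000
}

func main() {
	const durationMs = 10000 // a 10-second voice message

	aac := aacBytes(128000, durationMs)
	pcm := pcmBytes(44100, 1, durationMs) // mono is an assumption

	fmt.Printf("AAC ~%d bytes, PCM %d bytes, ratio %.1fx\n",
		aac, pcm, float64(pcm)/float64(aac))
}
```

Under these assumptions a 10-second message is roughly 160 KB of AAC against 882 KB of raw PCM, which is why short voice clips fit comfortably through the existing file upload path.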

3. Architecture

3.1 Data model changes

Backend SendMessage struct

type SendMessage struct {
    // ... other fields
    Type     string `json:"type"`              // "text" | "file" | "voice"
    Duration int64  `json:"duration,omitempty"` // voice duration in milliseconds
}

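Key decision: voice is a separate message type rather than a subtype of file

Rationale: voice messages need a duration field and follow different business logic
Storage still reuses fileId; no new field is added for it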
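The effect of `omitempty` on the wire format can be sketched as follows. This is an illustration, not the actual server type: the struct is trimmed to the fields discussed here, and the `FileID` field with its `fileId` JSON tag is an assumption based on the field name used elsewhere in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SendMessage is a trimmed-down stand-in for the backend struct.
// FileID and its JSON tag are assumed, not taken from server.go.
type SendMessage struct {
	Type     string `json:"type"`               // "text" | "file" | "voice"
	FileID   string `json:"fileId,omitempty"`   // reused by both file and voice messages
	Duration int64  `json:"duration,omitempty"` // voice duration in milliseconds
}

// encode marshals a message and returns the JSON text, or "" on error.
func encode(m SendMessage) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Text messages carry no duration, so the key is omitted entirely.
	fmt.Println(encode(SendMessage{Type: "text"}))

	// Voice messages reuse the file reference and add a duration.
	fmt.Println(encode(SendMessage{Type: "voice", FileID: "f1", Duration: 3200}))
}
```

Because a zero `Duration` is dropped from the output, existing text and file payloads keep their shape, and only voice messages gain the extra key.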

Frontend SendMessage model

enum SendMessageType { text, file, voice }
 
class SendMessage {
  final SendMessageType type;
  final int? duration;  // voice duration in milliseconds
  // ... reuses fileId, fileName, and the other existing fields
}

3.2 Feature flow

┌──────────────────────────────────────────────────────────────────┐
│                        Record & send flow                        │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  [Long-press the microphone]                                     │
│       ↓                                                          │
│  Request mic permission → create temp file → start recording     │
│       ↓                                                          │
│  [Release finger]                                                │
│       ↓                                                          │
│  Stop recording → compute duration                               │
│       ↓                                                          │
│  Duration < 500 ms? → discard, toast "recording too short"       │
│       ↓                                                          │
│  Add temp message (optimistic update) → upload via uploadByPath  │
│       ↓                                                          │
│  Poll upload progress → then call addSendMessage(type: voice)    │
│       ↓                                                          │
│  Refresh the message list                                        │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────┐
│                          Playback flow                           │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  [Tap the voice card]                                            │
│       ↓                                                          │
│  Get streamUrl + headers → AudioSource.uri(..., headers)         │
│       ↓                                                          │
│  Streaming playback (download, decrypt, and play concurrently)   │
│       ↓                                                          │
│  Listen for playback completion → reset playback state           │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

4. Implementation details

4.1 Permission configuration

Android (AndroidManifest.xml)

<uses-permission android:name="android.permission.RECORD_AUDIO"/>

Runtime permission request

final status = await Permission.microphone.request();
if (!status.isGranted) {
  showAppToast(context, '需要麦克风权限');  // "Microphone permission required"
  return;
}

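4.2 Recording implementation

// State
final AudioRecorder _audioRecorder = AudioRecorder();
bool _isRecording = false;
String? _recordingPath;
DateTime? _recordingStartTime;
 
// Start recording
Future<void> _startRecording() async {
  final status = await Permission.microphone.request();
  if (!mounted) return;  // the widget may be gone once the dialog closes
  if (!status.isGranted) {
    showAppToast(context, '需要麦克风权限');  // "Microphone permission required"
    return;
  }
  
  // Build the temporary file path
  final dir = await getTemporaryDirectory();
  final timestamp = DateTime.now().millisecondsSinceEpoch;
  _recordingPath = '${dir.path}/voice_$timestamp.m4a';
  
  // Start recording
  await _audioRecorder.start(
    const RecordConfig(
      encoder: AudioEncoder.aacLc,
      bitRate: 128000,
      sampleRate: 44100,
    ),
    path: _recordingPath!,
  );
  
  // Take the timestamp after start() returns so setup time is not counted
  _recordingStartTime = DateTime.now();
  if (!mounted) return;
  setState(() => _isRecording = true);
}
 
// Stop recording and send
Future<void> _stopRecordingAndSend() async {
  if (!_isRecording) return;
  
  final path = await _audioRecorder.stop();
  final duration = DateTime.now().difference(_recordingStartTime!).inMilliseconds;
  if (!mounted) return;
  
  setState(() => _isRecording = false);
  
  // Too short: do not send
  if (duration < 500) {
    showAppToast(context, '录音太短');  // "Recording too short"
    return;
  }
  
  if (path != null) {
    await _sendVoiceMessage(path, duration);
  }
}
 
// Discard the recording; referenced by onLongPressCancel in 4.6.
// A minimal sketch, assuming AudioRecorder.cancel() stops and discards the file.
Future<void> _cancelRecording() async {
  if (!_isRecording) return;
  await _audioRecorder.cancel();
  if (mounted) setState(() => _isRecording = false);
}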

4.3 Sending the voice message (reusing file upload)

Future<void> _sendVoiceMessage(String filePath, int durationMs) async {
  final appState = context.read<AppState>();
  final tempId = const Uuid().v4();
  final now = DateTime.now();
  final fileName = 'voice_${now.millisecondsSinceEpoch}.m4a';
  
  // 1. Create a temporary message (optimistic update)
  final tempMessage = SendMessage(
    id: tempId,
    type: SendMessageType.voice,
    duration: durationMs,
    createdAt: now,
  );
  
  setState(() {
    _messages.add(tempMessage);
    _pendingMessageIds.add(tempId);
  });
  _scrollToBottom();
  
  // 2. Reuse the file upload path
  final result = await appState.api.uploadByPath(
    filePath: filePath,
    fileName: fileName,
    remotePath: '/录音/$fileName',  // remote folder name; "录音" means "recordings"
    skipMetadata: true,  // voice messages need no metadata extraction
  );
  if (!mounted) return;
  
  if (!result.isSuccess) {
    _markMessageFailed(tempId, tempMessage);
    return;
  }
  
  // 3. Poll upload progress. The loop is capped so a stuck task cannot spin
  //    forever; 120 attempts (about 60 s) is an assumed limit, tune as needed.
  final taskId = result.data!.taskId;
  for (var attempt = 0; attempt < 120; attempt++) {
    final progress = await appState.api.getUploadProgress(taskId);
    if (!mounted) return;
    if (!progress.isSuccess) break;
    
    if (progress.data!.isDone) {
      if (progress.data!.fileId != null) {
        // 4. Upload finished: send the message
        await appState.api.addSendMessage(
          sessionId: widget.session.id,
          type: 'voice',
          fileId: progress.data!.fileId,
          duration: durationMs,
        );
        if (!mounted) return;
        
        _pendingMessageIds.remove(tempId);
        _refreshMessages();
        return;
      }
      break;
    }
    
    await Future.delayed(const Duration(milliseconds: 500));
  }
  
  _markMessageFailed(tempId, tempMessage);
}

4.4 Voice card UI

Widget _buildVoiceContent(SendMessage message) {
  final isPlaying = _playingMessageId == message.id;
  final durationMs = message.duration ?? 0;
  final durationSec = (durationMs / 1000).ceil();
  
  return GestureDetector(
    onTap: () => _playVoice(message),
    child: Container(
      constraints: const BoxConstraints(minWidth: 100, maxWidth: 200),
      padding: const EdgeInsets.symmetric(horizontal: 12, vertical: 8),
      child: Row(
        mainAxisSize: MainAxisSize.min,
        children: [
          // Play/pause button
          Container(
            width: 36,
            height: 36,
            decoration: BoxDecoration(
              color: Colors.white.withAlpha(50),
              shape: BoxShape.circle,
            ),
            child: Icon(
              isPlaying ? Icons.pause : Icons.play_arrow,
              color: Colors.white,
              size: 24,
            ),
          ),
          const SizedBox(width: 8),
          // Duration label (seconds, rounded up)
          Text(
            '$durationSec"',
            style: const TextStyle(color: Colors.white, fontSize: 14),
          ),
          const SizedBox(width: 8),
          // Waveform icon
          Icon(
            Icons.graphic_eq,
            color: Colors.white.withAlpha(180),
            size: 20,
          ),
        ],
      ),
    ),
  );
}

4.5 Voice playback (streaming decryption)

final AudioPlayer _audioPlayer = AudioPlayer();
StreamSubscription<PlayerState>? _playerStateSub;  // requires dart:async
String? _playingMessageId;
 
Future<void> _playVoice(SendMessage message) async {
  if (message.fileId == null) return;
  
  final appState = context.read<AppState>();
  
  // Tapping the message that is already playing stops it
  if (_playingMessageId == message.id) {
    await _audioPlayer.stop();
    if (mounted) setState(() => _playingMessageId = null);
    return;
  }
  
  // Stop whatever is currently playing
  await _audioPlayer.stop();
  if (!mounted) return;
  
  try {
    // Get the streaming URL and request headers (including the token)
    final streamUrl = appState.api.getStreamUrl(message.fileId!);
    final headers = appState.api.getStreamHeaders();
    
    setState(() => _playingMessageId = message.id);
    
    // Use AudioSource.uri to pass the request headers
    await _audioPlayer.setAudioSource(
      AudioSource.uri(
        Uri.parse(streamUrl),
        headers: headers,
      ),
    );
    
    // Listen for completion. Cancel the previous subscription first so
    // listeners do not accumulate with every play.
    await _playerStateSub?.cancel();
    _playerStateSub = _audioPlayer.playerStateStream.listen((state) {
      if (state.processingState == ProcessingState.completed) {
        if (mounted) setState(() => _playingMessageId = null);
      }
    });
    _audioPlayer.play();
  } catch (e) {
    if (!mounted) return;
    setState(() => _playingMessageId = null);
    showAppToast(context, '播放失败: $e');  // "Playback failed"
  }
}

4.6 Microphone button UI

// In the input area, right after the plus button
const SizedBox(width: 8),
// Microphone button: hold to record
GestureDetector(
  onLongPressStart: (_) => _startRecording(),
  onLongPressEnd: (_) => _stopRecordingAndSend(),
  onLongPressCancel: () => _cancelRecording(),
  child: AnimatedContainer(
    duration: const Duration(milliseconds: 200),
    width: plusButtonSize,
    height: plusButtonSize,
    decoration: BoxDecoration(
      color: _isRecording ? Colors.red : overlayColor,
      shape: BoxShape.circle,
    ),
    child: Center(
      child: Icon(
        _isRecording ? Icons.stop : Icons.mic,
        color: Colors.white,
        size: 20,
      ),
    ),
  ),
),

5. Backend changes

5.1 Message type validation

// Validate the addSendMessage request
if req.Type != "text" && req.Type != "file" && req.Type != "voice" {
    c.JSON(http.StatusBadRequest, gin.H{"error": "type must be 'text', 'file' or 'voice'"})
    return
}
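The same check can be pulled into a pure function so it is testable without an HTTP context. The following is a sketch under stated assumptions: `validateSendMessage` is a hypothetical name, and the two voice-specific rules (non-empty fileId, positive duration) are suggestions that the handler shown above does not confirm.

```go
package main

import (
	"errors"
	"fmt"
)

// allowedTypes lists the message types accepted by addSendMessage.
var allowedTypes = map[string]bool{"text": true, "file": true, "voice": true}

// validateSendMessage checks the message type and, for voice messages,
// two extra rules that are assumptions rather than documented behavior.
func validateSendMessage(msgType, fileID string, durationMs int64) error {
	if !allowedTypes[msgType] {
		return errors.New("type must be 'text', 'file' or 'voice'")
	}
	if msgType == "voice" {
		if fileID == "" {
			return errors.New("voice message requires fileId") // assumed rule
		}
		if durationMs <= 0 {
			return errors.New("voice message requires a positive duration") // assumed rule
		}
	}
	return nil
}

func main() {
	fmt.Println(validateSendMessage("voice", "f1", 3200)) // valid voice message
	fmt.Println(validateSendMessage("video", "", 0))      // unknown type
	fmt.Println(validateSendMessage("voice", "f1", 0))    // missing duration
}
```

A handler could call this function and map any non-nil error to the same `http.StatusBadRequest` response used above.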

5.2 Session preview

// Update the session preview
if req.Type == "voice" {
    session.LastMessagePreview = "[语音]" // "[Voice]"
}

6. Problems encountered and fixes

6.1 record version incompatibility

Problem: with record: ^5.1.2, the record_linux plugin is incompatible with record_platform_interface

Error: The non-abstract class 'RecordLinux' is missing implementations for these members:
 - RecordMethodChannelPlatformInterface.startStream

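Fix: upgrade to record: ^6.0.0

6.2 just_audio setUrl and request headers

Problem: the call _audioPlayer.setUrl(url, headers: headers) was not available during implementation (as recorded at the time; this may depend on the installed just_audio version, so check the setUrl signature before relying on this note)

Fix: use AudioSource.uri: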

await _audioPlayer.setAudioSource(
  AudioSource.uri(Uri.parse(streamUrl), headers: headers),
);

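6.3 Method name conflict

Problem: the newly added _markMessageFailed had the same name as an existing method

Fix: reuse the existing method instead of adding a new one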

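7. Future enhancements

Not implemented for now; to be revisited once the requirements are clear:

Speech-to-text: requires choosing an STT service
Recording waveform visualization: requires real-time audio analysis
Cancel-recording gesture: swipe up to cancel
Recording length limit: send automatically after N seconds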

8. Related files

client/lib/ui/chat_page.dart - recording, playback, and UI implementation
client/lib/core/models/send_message.dart - data model
client/lib/core/api/api_client.dart - API calls
client/pubspec.yaml - dependency configuration
client/android/app/src/main/AndroidManifest.xml - permission configuration
core/internal/api/server.go - backend API