Last updated: 2023-06-14 10:34:16

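Scenario Data Preparation

▶︎ Log Planning

AnyBackup business logs are divided by log type into the following categories:

  • Scheduled data protection backup job logs: anybackup_timing_backup
  • Scheduled data protection archive job logs: anybackup_timing_archive
  • Scheduled data protection recovery job logs: anybackup_timing_recover
  • Scheduled data protection cleanup job logs: anybackup_timing_clean
  • Copy data management (CDM) cleanup job logs: anybackup_cdm_clean
  • Copy data management (CDM) backup job logs: anybackup_cdm_backup

Note: The purpose of this division is to create log library templates by log type in the steps below, so that logs of the same type go into the same index. Logs of the same type from different business systems can be distinguished by their tags.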

Log type | Log tags | Databases | Tables

anybackup_timing_backup | AnyBackup,Timing,backup | BackupServiceDB, AuthServiceDB | BackupServiceDB.BackupService_job, BackupServiceDB.BackupService_jobinstance, BackupServiceDB.group, BackupServiceDB.BackupService_datasource, AuthServiceDB.job_cap
anybackup_timing_archive_instance | AnyBackup,Timing,archive,instance | ArchiveServiceDB | ArchiveServiceDB.archive_task, ArchiveServiceDB.archive_history, ArchiveServiceDB.archive_data_source
anybackup_timing_archive_task | AnyBackup,Timing,archive,task | ArchiveServiceDB | ArchiveServiceDB.archive_task, ArchiveServiceDB.archive_history
anybackup_timing_recover_backup | AnyBackup,Timing,recover,backup | BackupServiceDB | BackupServiceDB.BackupService_job, BackupServiceDB.BackupService_jobinstance, BackupServiceDB.BackupService_datasource
anybackup_timing_recover_archive | AnyBackup,Timing,recover,archive | ArchiveServiceDB | ArchiveServiceDB.archive_task, ArchiveServiceDB.archive_history, ArchiveServiceDB.archive_data_source
anybackup_timing_clean_backup | AnyBackup,Timing,clean,backup | BackupServiceDB, CommonServiceDB | BackupService_cleanjob, BackupService_jobinstance, BackupService_cleantimepoint, client
anybackup_timing_clean_archive | AnyBackup,Timing,clean,archive | ArchiveServiceDB | clean_task, archive_data_source
anybackup_cdm_clean | AnyBackup,CDM,clean | CDMDispatchServiceDB | time_point_clean_detail, clean_job
anybackup_cdm_backup_running | AnyBackup,CDM,backup_job,running | CDMDispatchServiceDB, CDMStoreMgmServiceDB | CDMDispatchServiceDB.backup_job, CDMDispatchServiceDB.job_instance, CDMDispatchServiceDB.target_obj, CDMStoreMgmServiceDB.snapshot_pool
anybackup_cdm_backup_log | AnyBackup,CDM,backup,log | CDMDispatchServiceDB | CDMDispatchServiceDB.job_instance, CDMDispatchServiceDB.backup_job, CDMDispatchServiceDB.log
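The log plan can also be kept as a small lookup structure, which makes it easy to check that every log type has its tags before the collection tasks are configured. The following is a minimal sketch only: the names LOG_PLAN, tags_for and log_types_with_tag are hypothetical helpers and are not part of AnyRobot or AnyBackup; the log types, tags and database names are copied from the plan above.

```python
# Hypothetical in-code copy of the log plan: log type -> (tags, databases).
# The values come from the planning table; the structure itself is an assumption.
LOG_PLAN = {
    "anybackup_timing_backup": ("AnyBackup,Timing,backup", ["BackupServiceDB", "AuthServiceDB"]),
    "anybackup_timing_archive_instance": ("AnyBackup,Timing,archive,instance", ["ArchiveServiceDB"]),
    "anybackup_timing_archive_task": ("AnyBackup,Timing,archive,task", ["ArchiveServiceDB"]),
    "anybackup_timing_recover_backup": ("AnyBackup,Timing,recover,backup", ["BackupServiceDB"]),
    "anybackup_timing_recover_archive": ("AnyBackup,Timing,recover,archive", ["ArchiveServiceDB"]),
    "anybackup_timing_clean_backup": ("AnyBackup,Timing,clean,backup", ["BackupServiceDB", "CommonServiceDB"]),
    "anybackup_timing_clean_archive": ("AnyBackup,Timing,clean,archive", ["ArchiveServiceDB"]),
    "anybackup_cdm_clean": ("AnyBackup,CDM,clean", ["CDMDispatchServiceDB"]),
    "anybackup_cdm_backup_running": ("AnyBackup,CDM,backup_job,running",
                                     ["CDMDispatchServiceDB", "CDMStoreMgmServiceDB"]),
    "anybackup_cdm_backup_log": ("AnyBackup,CDM,backup,log", ["CDMDispatchServiceDB"]),
}


def tags_for(log_type):
    """Return the tag list to enter for a given log type."""
    tags, _databases = LOG_PLAN[log_type]
    return tags.split(",")


def log_types_with_tag(tag):
    """Return all log types that carry the given tag, sorted by name."""
    return sorted(t for t in LOG_PLAN if tag in tags_for(t))
```

For example, log_types_with_tag("CDM") lists the three copy data management log types, which is the kind of tag-based filtering the note above describes.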

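Configure Log Libraries

1. Go to Data Management > Data Storage > Log Library Management > Log Library and create a new log library:

  • The template name and data type must match the log type from the previous step, that is, the type of data actually being ingested. Based on the log classification above, 10 log library templates need to be created: anybackup_timing_backup, anybackup_timing_archive_instance, anybackup_timing_archive_task, anybackup_timing_recover_backup, anybackup_timing_recover_archive, anybackup_timing_clean_backup, anybackup_timing_clean_archive, anybackup_cdm_clean, anybackup_cdm_backup_running and anybackup_cdm_backup_log.

Note: The names must match those written here exactly. Which templates you actually create depends on the log types that can be collected in your environment.

  • Set the log library creation rule to monthly where possible:

  • If a data type produces a large volume, increase the number of shards when creating the log library. For example, if the monthly volume is about 100 GB, set 4 shards;
  • If you are unsure how much log data a type produces per month, set three shards with a size of 90 GB. If the volume exceeds this there is no need to worry: the corresponding indices are incremented as xxx_01,..;
  • In a standalone or single-node cluster environment, set the number of replicas to 0; in a multi-node cluster, set it to 1;
  • Keep the default lifecycle;
  • Enable full-text indexing if you need keyword search on the search page;
  • Configure the field settings as follows:

  • If hot-warm migration is not needed, configure as follows:

  • For example, the completed anybackup_timing_backup log library looks like this:

  • After creation, the corresponding log library index is generated automatically. Switch to the Log Library tab to see the newly generated index. If it was not generated, the log library creation failed and must be repeated: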
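The shard and replica guidance above can be sketched as a small helper. This is an illustration under stated assumptions: the figure of roughly 30 GB per shard is inferred from the two examples in this guide (90 GB across 3 shards, and 4 shards for about 100 GB) and is not a documented AnyRobot limit; the function names are hypothetical.

```python
import math

# Assumption: about 30 GB per shard, inferred from "3 shards, 90 GB" in this guide.
GB_PER_SHARD = 30
# Default from the guide when the monthly volume is unknown.
DEFAULT_SHARDS = 3


def suggest_shards(monthly_gb=None):
    """Suggest a shard count for a monthly log library."""
    if monthly_gb is None:
        return DEFAULT_SHARDS
    return max(1, math.ceil(monthly_gb / GB_PER_SHARD))


def suggest_replicas(node_count):
    """0 replicas on a standalone or single-node cluster, 1 on a multi-node cluster."""
    return 0 if node_count <= 1 else 1
```

With these assumptions, suggest_shards(100) yields 4, matching the example in the list, and suggest_shards() falls back to the three-shard default.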

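Configure Remote Collection

All 10 data types above require a remote-collection database synchronization task, which collects data from the AnyBackup databases into AnyRobot. Taking the anybackup_timing_backup logs as an example, the collection is configured as follows:

Note:

1. Before configuring data collection, confirm that the parsing rules have been imported successfully so that the data can be parsed correctly;

2. All parameters below are sample values. Configure them according to your actual environment.

1. Go to the Data Management > Data Sources > Remote Collection > Database Connection page, click [New], and configure the parameters as follows: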

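Parameter | Description

*Name | A custom name for the database
*Database type | MySQL by default; change it to match the type of the database being configured
*Host | IP address of the target host
*Port | Database port on the target host
*User name | Database user name on the target host
*Character set | utf8 by default; change it to match the character set of the database being configured
*Database name | Name of the database; click [Test] to check its network and authentication status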
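Before saving the connection, it can help to confirm that the database port is reachable from the AnyRobot host. The sketch below checks TCP reachability only and does not verify credentials, so it does not replace the check built into the page. The function name port_reachable is hypothetical, and the host and port in the usage comment are placeholder values.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage with placeholder values (3306 is the MySQL default port):
# port_reachable("192.0.2.10", 3306)
```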

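2. Go to the Data Management > Data Sources > Remote Collection > Database Connection page, select Database Sync under the database configuration, click [New], and configure the parameters as follows: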

Parameter | Description

*Name | anybackup_timing_backup
*Database association | Select the database saved in the previous step
*Log type | anybackup_timing_backup
Log tags | AnyBackup,Timing,backup

3. Click [Next] and configure the parameters as follows:

  • *SQL statement:

Note: The SQL statement can be modified as needed.

SELECT
    B.*,
    e.id AS 任务实例id,
    e.`status` AS 结果状态,
    e.backup_type AS 备份方式,
    e.eef_start_time_ms,
    e.eef_end_time_ms,
    CAST((e.eef_end_time_ms - e.eef_start_time_ms) / (1000 * 60 * 60) AS CHAR) AS 已运行时间,
    CAST(ROUND((e.completed_data - e.send_size) / (1024 * 1024), 2) AS CHAR) AS 重删节省空间,
    CAST(e.total_size / (1024 * 1024) AS CHAR) AS 总数据量,
    CAST(e.completed_data / (1024 * 1024) AS CHAR) AS 传输数据量,
    CAST(e.send_size / (1024 * 1024) AS CHAR) AS 实际备份数据量,
    CAST(ROUND((e.completed_data - e.send_size) / e.completed_data * 100, 2) AS CHAR) AS 重删率,
    CAST(ROUND(e.speed / (1024 * 1024), 2) AS CHAR) AS 传输速度
FROM
    BackupServiceDB.BackupService_jobinstance AS e
LEFT JOIN (
    SELECT
        A.*,
        b.NAME AS 任务组
    FROM (
        SELECT
            D.*,
            a.NAME AS 任务名称,
            a.STATUS AS 任务状态,
            a.create_time AS 创建时间,
            a.update_time AS 最近更新时间,
            a.type AS 应用类型,
            (
                CASE
                WHEN a.virtual_platform_id != '' THEN
                    CONCAT(a.virtual_platform_name, '(', a.virtual_platform_ip, ')')
                WHEN a.ccloud_id != '' THEN
                    CONCAT(a.ccloud_name, '(', a.ccloud_ip_domain, ')')
                ELSE
                    CONCAT(a.client_name, '(', a.client_ip, ')')
                END
            ) AS 备份对象名称,
            a.storage_id AS 存储介质,
            a.client_id AS 客户端id,
            a.client_name AS 客户端名称,
            a.client_os_type AS 客户端操作系统,
            a.virtual_platform_id AS 虚拟化平台id,
            a.virtual_platform_name AS 虚拟化平台名称,
            a.virtual_platform_ip AS 虚拟化平台ip,
            a.ccloud_id AS 云平台id,
            a.ccloud_ip_domain AS 云平台ip,
            a.ccloud_name AS 云平台名称,
            a.group_id,
            a.id,
            a.is_backup
        FROM
            BackupServiceDB.BackupService_job AS a
        LEFT JOIN (
            SELECT
                d.job_id,
                GROUP_CONCAT(SUBSTRING_INDEX(d.full_path, '/', -1) SEPARATOR '\n') AS 数据源名称,
                GROUP_CONCAT(d.full_path SEPARATOR '\n') AS 数据源路径
            FROM
                BackupServiceDB.BackupService_datasource AS d
            GROUP BY
                d.job_id
        ) AS D ON a.id = D.job_id
    ) AS A
    LEFT JOIN BackupServiceDB.`group` AS b ON A.group_id = b.id
) AS B ON e.job_id = B.id
WHERE
    B.is_backup = 1
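The derived columns in the SQL statement above (已运行时间, 重删节省空间, 重删率, 传输速度) are plain arithmetic on the job instance fields. The following sketch restates that arithmetic so the results can be sanity-checked against sample rows. It assumes the raw size columns hold bytes, which the divisions by 1024 * 1024 suggest but the source does not state; the English key names and the function derived_metrics are hypothetical. Python's round can differ from MySQL's ROUND on exact ties, so treat the output as illustrative.

```python
MIB = 1024 * 1024               # divisor used by the SQL for size columns
MS_PER_HOUR = 1000 * 60 * 60    # divisor used by the SQL for the elapsed time


def derived_metrics(row):
    """Recompute the derived columns of the sync query for one job instance row."""
    completed = row["completed_data"]
    sent = row["send_size"]
    saved = completed - sent
    return {
        # 已运行时间: elapsed time in hours
        "elapsed_hours": (row["eef_end_time_ms"] - row["eef_start_time_ms"]) / MS_PER_HOUR,
        # 重删节省空间: space saved by deduplication
        "dedup_saved_mib": round(saved / MIB, 2),
        # 重删率: deduplication rate in percent; MySQL returns NULL when dividing by zero,
        # which this sketch mirrors with None
        "dedup_rate_pct": round(saved / completed * 100, 2) if completed else None,
        # 传输速度: transfer speed
        "speed_mib": round(row["speed"] / MIB, 2),
    }
```

A row with no transferred data (completed_data equal to 0) therefore produces an empty deduplication rate rather than an error, which is worth keeping in mind when building KPIs on that field.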

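Parameter | Description

*Execution plan | */59 * * * *  Intended to collect about once an hour; adjust it to the collection interval you need. Under standard cron semantics, */59 matches minutes 0 and 59 of every hour, so use 0 * * * * if exactly one run per hour is required
*Collection mode | Incremental collection
*Incremental field | eef_end_time_ms
Field type | Numeric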
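To see which minutes the execution plan */59 * * * * actually matches, the minute field can be expanded by hand. The sketch below handles only the three simple forms "*", "*/N" and a single number, and it assumes the scheduler follows standard cron semantics, which this guide does not confirm for AnyRobot. The function name expand_minute_field is hypothetical.

```python
def expand_minute_field(field):
    """Expand a cron minute field of the form '*', '*/N' or 'M' into matching minutes."""
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        # A step starts counting from minute 0 and is applied across 0..59.
        return list(range(0, 60, step))
    return [int(field)]
```

Under these assumptions, expand_minute_field("*/59") gives minutes 0 and 59, that is, two runs one minute apart each hour, whereas expand_minute_field("0") gives a single run at the top of the hour.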

4. Click [Next] and configure the parameters as follows:

5. Click [Next] to complete the database synchronization configuration.
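Incremental collection on a numeric field such as eef_end_time_ms is commonly implemented with a watermark: each run keeps the largest value seen so far and the next run fetches only rows above it. The sketch below illustrates that general idea on in-memory rows. It is an assumption about how such a mode typically behaves, not a description of AnyRobot's internals, and collect_increment is a hypothetical name.

```python
def collect_increment(rows, watermark, field="eef_end_time_ms"):
    """Return rows whose incremental field exceeds the watermark, plus the new watermark."""
    # Rows without a value in the incremental field are skipped in this sketch.
    new_rows = [r for r in rows if r[field] is not None and r[field] > watermark]
    new_watermark = max((r[field] for r in new_rows), default=watermark)
    return new_rows, new_watermark


# Example: two runs over a growing table.
table = [{"id": 1, "eef_end_time_ms": 100}, {"id": 2, "eef_end_time_ms": 200}]
first_batch, mark = collect_increment(table, 0)
table.append({"id": 3, "eef_end_time_ms": 300})
second_batch, mark = collect_increment(table, mark)
```

In this example the second run picks up only the row with id 3, because the first two rows are at or below the stored watermark.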

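After the collection configuration is complete for all log types, wait until the data has been parsed successfully and has entered AnyRobot before continuing with the configuration steps that follow.

Scenario Analysis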

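► Define KPI Metrics

Build the model around the application system, splitting it by key IT service:

► KPI Configuration Details

Note:

1. Define a threshold for each KPI: choose a static threshold for metrics with little or no fluctuation, and a dynamic threshold for metrics that fluctuate periodically with the business cycle;

2. Define each KPI's impact weight on the service: choose the weight according to the scope and degree of the metric's impact on the service and the business.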
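The two notes above can be illustrated with a generic sketch. Everything here is an assumption made for illustration: the mean-plus-k-standard-deviations rule for a dynamic threshold and the weighted average for a service score are common textbook choices, not the formulas AnyRobot uses, and all function names are hypothetical.

```python
import statistics


def static_breach(value, threshold):
    """Static threshold: suitable for a KPI with little or no fluctuation."""
    return value > threshold


def dynamic_breach(value, history, k=3.0):
    """Dynamic threshold: flag values more than k standard deviations from the recent mean."""
    mean = statistics.fmean(history)
    spread = statistics.pstdev(history)
    return abs(value - mean) > k * spread


def weighted_health(kpi_scores, weights):
    """Combine per-KPI scores into one service score using impact weights."""
    total_weight = sum(weights[name] for name in kpi_scores)
    return sum(kpi_scores[name] * weights[name] for name in kpi_scores) / total_weight
```

For instance, giving a backup success KPI three times the weight of a transfer speed KPI makes the service score follow backup failures much more closely than speed dips.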

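► Results

1) Service analyzer tile view

Provides a quick view of the overall health of IT services. Click a service to drill down to its KPI details; click a KPI to view entity details; drill down further to the raw logs to pinpoint the root cause of a business failure precisely and intuitively:

2) Service analyzer tree view

Provides a quick view of the dependencies between services, as shown below:

3) KPI alert configuration

Set the alert trigger conditions and the alert execution plan to raise alerts for KPIs of a specified severity, as shown below:

4) KPI alert records

View the details of alert events to locate problems quickly, as shown below:

5) KPI alert noise reduction

Addresses the difficulty of extracting valuable information from a large volume of alerts and enables precise alerting, as shown below:

6) AnyBackup business panorama

Builds a global business operations view in which the dependencies between services are clearly visible and the abnormal conditions of the AnyBackup application are displayed, as shown below: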