How to Write Elegant Golang Code

The original article is an excellent engineering summary and well worth reading. Some notes below.

  1. Code style: gofmt, goimports, golangci-lint, etc., enforced automatically in CI
  2. Directory layout: Standard Go Project Layout
    1. /pkg: package modules that may be used by external projects
    2. /internal: private modules, not usable from outside
    3. no /src directory
    4. /cmd: builds the executables
    5. /api: the externally exposed API module
    6. avoid modules named model/controller; split by responsibility and develop against interfaces
  3. Do not initialise resources such as RPC/DB/Redis connections in init: init runs implicitly, so connections would be opened silently
  4. The recommended pattern is Client + NewClient, initialising the connection explicitly
  5. init is suited only to simple, lightweight precondition checks, such as flag setup
  6. Use GoMock/sqlmock/httpmock/monkey for mocks + tests
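A minimal sketch of the Client + NewClient pattern from points 3–5; the addr field and the backend it would dial are illustrative, not from the original article:

```go
package main

import (
	"errors"
	"fmt"
)

// Client wraps a connection to some backend; its zero value is not usable,
// which pushes callers toward the explicit constructor below.
type Client struct {
	addr string
}

// NewClient initialises the connection explicitly, so the caller controls
// when (and whether) the resource is created -- unlike init(), which runs
// implicitly on import.
func NewClient(addr string) (*Client, error) {
	if addr == "" {
		return nil, errors.New("empty addr")
	}
	// real code would dial the backend here
	return &Client{addr: addr}, nil
}

func main() {
	c, err := NewClient("127.0.0.1:6379")
	if err != nil {
		panic(err)
	}
	fmt.Println(c.addr)
}
```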

Simple techniques to optimise Go programs covers a few simple but very effective performance tips:

  • Avoid using structures containing pointers as map keys for large maps, use ints or bytes
  • Use strings.Builder to build up strings
  • Use strconv instead of fmt.Sprintf
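As a sketch of the last two tips, building a comma-separated string with strings.Builder and strconv.Itoa instead of fmt.Sprintf (joinInts is a made-up helper, not from the article):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinInts builds "1,2,3" with strings.Builder and strconv.Itoa,
// avoiding both repeated string concatenation and the reflection
// overhead of fmt.Sprintf.
func joinInts(nums []int) string {
	var b strings.Builder
	for i, n := range nums {
		if i > 0 {
			b.WriteByte(',')
		}
		b.WriteString(strconv.Itoa(n))
	}
	return b.String()
}

func main() {
	fmt.Println(joinInts([]int{1, 2, 3}))
}
```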

Power of g in Vim

:[range]g[!]/pattern/cmd

! inverts the match, i.e. run cmd on lines that do NOT match pattern. Common values for cmd:

  • d: delete
  • m: move
  • t: copy, or co
  • s: replace
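A few examples of these combinations:

```vim
" delete all lines matching TODO
:g/TODO/d
" move every line containing ERROR to the end of the file
:g/ERROR/m$
" copy every line starting with func to the end of the file
:g/^func/t$
" on lines NOT matching foo, replace bar with baz
:g!/foo/s/bar/baz/g
```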

See :help :g in Vim for more info.

Dynamically Adding/Removing Hadoop DataNodes

Adding a node

  1. On the NameNode, add the node to etc/hadoop/slaves
  2. Sync the etc/hadoop config to the new node
  3. On the new node, run ./sbin/hadoop-daemon.sh start datanode
  4. On the NameNode, refresh: hdfs dfsadmin -refreshNodes
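Assuming the standard layout and a new node named hadoop04 (the hostname and sync method are placeholders), the four steps look like:

```sh
# 1. On the NameNode: register the new node
echo "hadoop04" >> etc/hadoop/slaves
# 2. Sync the config directory to the new node
rsync -a etc/hadoop/ hadoop04:/home/web/hadoop/etc/hadoop/
# 3. On the new node: start the DataNode
./sbin/hadoop-daemon.sh start datanode
# 4. Back on the NameNode: pick up the new node
hdfs dfsadmin -refreshNodes
```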

Removing a node

  1. Write the addresses of the nodes to remove into etc/hadoop/excludes
  2. Edit etc/hadoop/hdfs-site.xml:
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/home/web/hadoop/etc/hadoop/excludes</value>
  </property>
  3. Edit etc/hadoop/mapred-site.xml; this decommissions the NodeManager:
  <property>
    <name>mapred.hosts.exclude</name>
    <value>/home/web/hadoop/etc/hadoop/excludes</value>
    <final>true</final>
  </property>
  4. Edit etc/hadoop/slaves and remove the nodes being deleted
  5. Sync etc/hadoop/excludes and etc/hadoop/slaves to all NameNodes
  6. On the NameNode, run hdfs dfsadmin -refreshNodes
  7. Use hdfs dfsadmin -report to watch the nodes' state change: Normal -> Decommission in progress -> Decommissioned
  8. On the node being removed, run ./sbin/hadoop-daemon.sh stop datanode and wait for its Admin State to become Dead

Checking nodes

hdfs fsck / checks the file system. A healthy cluster reports Status: HEALTHY; Status: CORRUPT means some blocks are damaged. Missing blocks means blocks are lost but replicas still exist; Missing blocks (with replication factor 1) means blocks are lost with no replica and cannot be recovered.

Run hdfs fsck / -delete to check for and delete the corrupted blocks.

Adjusting JournalNodes

  1. Edit etc/hadoop/hdfs-site.xml:
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/cluster</value>
  </property>
  2. Sync the config to all nodes
  3. For a newly added node, also sync the edits files under dfs.journalnode.edits.dir
  4. Start/stop the JournalNode being adjusted: ./sbin/hadoop-daemon.sh start journalnode
  5. Restart the standby NameNode: sbin/hadoop-daemon.sh stop|start namenode
  6. Fail it over to active: hdfs haadmin -failover nn1 nn2, then restart the other NameNodes
  7. Check NN state: hdfs haadmin -getServiceState nn1

Using special SSH key for Git

In ~/.ssh/config:

Host github.com
    HostName github.com
    IdentityFile ~/.ssh/id_rsa_github
    User git

Don't forget to chmod 600 ~/.ssh/config.

Or, use GIT_SSH_COMMAND environment variable:

export GIT_SSH_COMMAND="ssh -i ~/.ssh/id_rsa_example -F /dev/null"
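Since Git 2.10 the same thing can also be pinned per repository via core.sshCommand (a sketch; the key path is an example):

```sh
# inside the repository that should use the dedicated key
cd /path/to/repo
git config core.sshCommand "ssh -i ~/.ssh/id_rsa_example -F /dev/null"
```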

Regex Unicode Scripts

  1. \p{Han} matches Han characters (Chinese hanzi and Japanese kanji), both simplified and traditional
  2. \p{Common} matches characters shared across scripts, such as punctuation and symbols
  3. \p{Latin} matches Latin-script characters
  4. Requires Perl-regex support in grep, i.e. grep -P "\p{Han}", or use rg/ag
echo '中文/繁體/片仮名/かたかな/カタカナ/katakana' | rg -o "\p{Han}+"   > 中文 繁體 片仮名
echo '中文@mail.com' | rg -o "\p{Common}+"                              > @ .
echo '中文@mail.com' | rg -o "\p{Latin}+"                               > mail com

See Unicode Scripts for more.

Octotree for Safari

Since Safari 13, extensions can only be installed from the Mac App Store, so Octotree has to be built and loaded manually.


brew install [email protected]
export PATH="/usr/local/opt/[email protected]/bin:$PATH"
# make sure node and npm are v10: octotree uses gulp 3, which does not work with node 12

git clone https://github.com/ovity/octotree.git ~/src/octotree
cd ~/src/octotree
git checkout master

npm i
npm install [email protected]
npm run dist
# the built extension is located in ~/src/octotree/tmp/safari/octotree.safariextension/

cd ~/Library/Safari/Extensions
mv ~/src/octotree/tmp/safari/octotree.safariextension .
  1. Enable Developer menu in Safari
  2. Developer - Show Extension Builder
  3. Add octotree.safariextension and Run

MySQL Prefix Index

CREATE TABLE `t1` (
  `bundle` varchar(300) DEFAULT '' COMMENT 'pkg name',
  `domain` varchar(200) DEFAULT '',
  UNIQUE KEY `idx_bundle_domain` (`bundle`(100),`domain`(100))
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=utf8mb4;

The key part is the bundle(100) prefix, which avoids the Specified key was too long; max key length is 767 bytes error that composite indexes can hit: with utf8mb4 a character can take up to 4 bytes, so indexing the full varchar(300) would need 1200 bytes, while a 100-character prefix needs at most 400.

Deployment with git

#!/bin/sh

set -uex

PATH=$PATH:$HOME/bin
export PATH

DIR=/home/serv/project
cd ${DIR}

REV1=$(git rev-parse --verify HEAD)
git pull origin master
REV2=$(git rev-parse --verify HEAD)
test ${REV1} = ${REV2} && echo "Already updated" && exit

make
test $? -ne 0 && echo "make error" && exit

kill -HUP $(cat logs/run.pid)

The core idea is using git rev-parse --verify HEAD to get the current revision hash before and after the pull; if the two hashes are identical there is nothing new, and the script exits early instead of rebuilding and restarting.
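The script can then be driven by cron; the schedule and paths below are examples:

```sh
# crontab -e: poll for new commits every 5 minutes
*/5 * * * * /home/serv/project/deploy.sh >> /home/serv/logs/deploy.log 2>&1
```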

logrotate

logrotate - rotates, compresses, and mails system logs

# 0 0 * * * /usr/sbin/logrotate --state=/home/serv/logrotate.state /home/serv/logrotate.log.conf
/home/serv/logs/dev.log
/home/serv/logs/access.log {
    rotate 10
    daily
    compress
    create
    copytruncate
    missingok
    dateext
    dateformat -%Y-%m-%d
    dateyesterday

    sharedscripts
    postrotate
        kill -USR1 `cat /var/run/nginx.pid`
    endscript
}
  1. Either install the config under /etc and let the system schedule it, or schedule it yourself via crontab; in the latter case remember to pass --state to persist the rotation state
  2. Services like nginx reopen their log files on kill -USR1; for services without such support, use copytruncate, which copies the log first and then truncates it

Druid Query in JSON

Druid can be queried with SQL from Superset; it can also be queried directly over HTTP with JSON:

curl -X POST 'http://<host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -H 'Accept:application/json' -d @query.json
{
  "queryType": "timeseries",
  "dataSource": "cpm_log",
  "granularity": "hour",
  "aggregations": [
    {
      "type": "longSum",
      "name": "requests",
      "fieldName": "req_count_raw"
    },
    {
      "type": "longSum",
      "name": "impressions",
      "fieldName": "win_count"
    },
    {
      "type": "floatSum",
      "name": "revenues",
      "fieldName": "win_price"
    }
  ],
  "postAggregations": [
    {
      "type":"arithmetic",
      "name": "ecpm",
      "fn": "/",
      "fields": [
        {
          "type": "fieldAccess",
          "name": "postAgg_rev",
          "fieldName": "revenues"
        },
        {
          "type": "fieldAccess",
          "name": "postAgg_imps",
          "fieldName": "impressions"
        }
      ]
    }
  ],
  "filter": {
    "type": "and",
    "fields": [
      {
        "type": "selector",
        "dimension": "device_os",
        "value": "android"
      },
      {
        "type": "in",
        "dimension": "req_ad_type",
        "values": ["banner"]
      }
    ]
  },
  "context": {
    "grandTotal": true
  },
  "intervals": [
    "2019-04-09T00:00:00+08:00/2019-04-09T23:00:00+08:00"
  ]
}
  1. queryType can be timeseries, topN, groupBy, search, or timeBoundary
  2. Avoid groupBy queries where possible; they are inefficient
  3. topN queries rank results by a metric
  4. context can set a queryId, which allows cancelling the query via DELETE /druid/v2/{queryId}
  5. Deduplication: {"type": "cardinality", "name": "distinct_pid", "fields": ["ad_pid"]}
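For example, a minimal topN sketch over the same dataSource, ranking device_os by impressions (the threshold and interval are illustrative):

```json
{
  "queryType": "topN",
  "dataSource": "cpm_log",
  "dimension": "device_os",
  "threshold": 10,
  "metric": "impressions",
  "granularity": "all",
  "aggregations": [
    { "type": "longSum", "name": "impressions", "fieldName": "win_count" }
  ],
  "intervals": ["2019-04-09T00:00:00+08:00/2019-04-09T23:00:00+08:00"]
}
```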

RTFM, godruid