• [二次开发] 本地环境消费开源kafka,将数据写入mrs kafka
    生产者代码:报错情况:
  • [问题求助] mrs 提交任务到yarn报错 Could not load service provider for table factories
    mrs 提交任务到yarn报错   Could not load service provider for table factories
  • [问题求助] 请问如何使用mrs flinksql与oceanbase对接
    能否使用flink-sql-connector-oceanbase-cdc 进行连接,请问mrs flink是否支持这么做,如果支持,我该把jar放入那个位置才能生效
  • [问题求助] mrs是否支持 flinkserver 对接oceanbase
    flink sql:CREATE TABLE ob_tbl1 (     col1 INT,     col2 VARCHAR(20),     col3 INT)     WITH ('connector' = 'oceanbase-cdc',     'scan.startup.mode' = 'initial',     'tenant-name' = 'mq_t1',     'username' = 'root@mq_t1',     'password' = 'pswd',     'database-name' = 'test_ob_to_kafka',     'table-name' = 'tbl1',     'hostname' = '10.10.101.64',     'port' = '2881',     'rootserver-list' = '10.10.101.64:2882:2881',     'logproxy.host' = '10.10.101.64',     'logproxy.port' = '2983');
  • [开发应用] DWS支持实时流数据读取吗?
    用flink实时读取dws数据,结果执行一会显示任务成功,不能流式执行?难道dws不支持吗?
  • [开发应用] DWS能作为实时数据的数仓分层吗?就是不仅能实时写入,还能实时读取,处理后再实时写入
    DWS能作为实时数据的数仓分层吗?就是不仅能实时写入,还能实时读取,处理后再实时写入
  • [问题求助] FusionInsight_HD_8.2.0.1产品,在Flink SQL客户端中select 'hello'报错KeeperErrorCode = ConnectionLoss for /flink_base/flink
    flinkSQL client中select 还是报错的,请帮忙指点下,哪里有问题?谢谢org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$SessionClosedRequireAuthException: KeeperErrorCode = Session closed because client failed to authenticate for /flink_base/flink或者org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /flink_base/flinkzookeeper已经启动,192.168.0.82:24002 ,而且zookeeper中的ACL权限已经设置,但是在设置配额失败[zk: 192.168.0.82:24002(CONNECTED) 5] setquota -n 1000000 /flink_base/flink Insufficient permission : /flink_base/flink tail -f /home/dmp/app/ficlient/Flink/flink/log/flink-root-sql-client-192-168-0-85.log  中的日志如下flink-conf.yaml中的全部配置如下akka.ask.timeout: 120 s akka.client-socket-worker-pool.pool-size-factor: 1.0 akka.client-socket-worker-pool.pool-size-max: 2 akka.client-socket-worker-pool.pool-size-min: 1 akka.framesize: 10485760b akka.log.lifecycle.events: false akka.lookup.timeout: 30 s akka.server-socket-worker-pool.pool-size-factor: 1.0 akka.server-socket-worker-pool.pool-size-max: 2 akka.server-socket-worker-pool.pool-size-min: 1 akka.ssl.enabled: true akka.startup-timeout: 10 s akka.tcp.timeout: 60 s akka.throughput: 15 blob.fetch.backlog: 1000 blob.fetch.num-concurrent: 50 blob.fetch.retries: 50 blob.server.port: 32456-32520 blob.service.ssl.enabled: true classloader.check-leaked-classloader: false classloader.resolve-order: child-first client.rpc.port: 32651-32720 client.timeout: 120 s compiler.delimited-informat.max-line-samples: 10 compiler.delimited-informat.max-sample-len: 2097152 compiler.delimited-informat.min-line-samples: 2 env.hadoop.conf.dir: /home/dmp/app/ficlient/Flink/flink/conf env.java.opts.client: -Djava.io.tmpdir=/home/dmp/app/ficlient/Flink/tmp env.java.opts.jobmanager: -Djava.security.krb5.conf=/opt/huawei/Bigdata/common/runtime/krb5.conf -Djava.io.tmpdir=${PWD}/tmp -Des.security.indication=true env.java.opts.taskmanager: -Djava.security.krb5.conf=/opt/huawei/Bigdata/common/runtime/krb5.conf -Djava.io.tmpdir=${PWD}/tmp -Des.security.indication=true env.java.opts: -Xloggc:<LOG_DIR>/gc.log -XX:+PrintGCDetails -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=20M -Djdk.tls.ephemeralDHKeySize=3072 -Djava.library.path=${HADOOP_COMMON_HOME}/lib/native -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false -Dbeetle.application.home.path=/opt/huawei/Bigdata/common/runtime/security/config -Dwcc.configuration.path=/opt/huawei/Bigdata/common/runtime/security/config -Dscc.configuration.path=/opt/huawei/Bigdata/common/runtime/securityforscc/config -Dscc.bigdata.common=/opt/huawei/Bigdata/common/runtime env.yarn.conf.dir: /home/dmp/app/ficlient/Flink/flink/conf flink.security.enable: true flinkserver.alarm.cert.skip: true flinkserver.host.ip: fs.output.always-create-directory: false fs.overwrite-files: false heartbeat.interval: 10000 heartbeat.timeout: 120000 high-availability.job.delay: 10 s high-availability.storageDir: hdfs://hacluster/flink/recovery high-availability.zookeeper.client.acl: creator high-availability.zookeeper.client.connection-timeout: 90000 high-availability.zookeeper.client.max-retry-attempts: 5 high-availability.zookeeper.client.retry-wait: 5000 high-availability.zookeeper.client.session-timeout: 90000 high-availability.zookeeper.client.tolerate-suspended-connections: true high-availability.zookeeper.path.root: /flink high-availability.zookeeper.path.under.quota: /flink_base high-availability.zookeeper.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 high-availability.zookeeper.quota.enabled: true high-availability: zookeeper job.alarm.enable: true jobmanager.heap.size: 1024mb jobmanager.web.403-redirect-url: https://192.168.0.82:28443/web/pages/error/403.html jobmanager.web.404-redirect-url: https://192.168.0.82:28443/web/pages/error/404.html jobmanager.web.415-redirect-url: https://192.168.0.82:28443/web/pages/error/415.html jobmanager.web.500-redirect-url: https://192.168.0.82:28443/web/pages/error/500.html jobmanager.web.access-control-allow-origin: * jobmanager.web.accesslog.enable: true jobmanager.web.allow-access-address: * jobmanager.web.backpressure.cleanup-interval: 600000 jobmanager.web.backpressure.delay-between-samples: 50 jobmanager.web.backpressure.num-samples: 100 jobmanager.web.backpressure.refresh-interval: 60000 jobmanager.web.cache-directive: no-store jobmanager.web.checkpoints.disable: false jobmanager.web.checkpoints.history: 10 jobmanager.web.expires-time: 0 jobmanager.web.history: 5 jobmanager.web.logout-timer: 600000 jobmanager.web.pragma-value: no-cache jobmanager.web.refresh-interval: 3000 jobmanager.web.ssl.enabled: false jobmanager.web.x-frame-options: DENY library-cache-manager.cleanup.interval: 3600 metrics.internal.query-service.port: 28844-28943 metrics.reporter.alarm.factory.class: com.huawei.mrs.flink.alarm.FlinkAlarmReporterFactory metrics.reporter.alarm.interval: 30 s metrics.reporter.alarm.job.alarm.checkpoint.consecutive.failures.num: 5 metrics.reporter.alarm.job.alarm.failure.restart.rate: 80 metrics.reporter.alarm.job.alarm.task.backpressure.duration: 180 s metrics.reporter: alarm nettyconnector.message.delimiter: $_ nettyconnector.registerserver.topic.storage: /flink/nettyconnector nettyconnector.sinkserver.port.range: 28444-28843 nettyconnector.ssl.enabled: false parallelism.default: 1 query.client.network-threads: 0 query.proxy.network-threads: 0 query.proxy.ports: 32541-32560 query.proxy.query-threads: 0 query.server.network-threads: 0 query.server.ports: 32521-32540 query.server.query-threads: 0 resourcemanager.taskmanager-timeout: 300000 rest.await-leader-timeout: 30000 rest.bind-port: 32261-32325 rest.client.max-content-length: 104857600 rest.connection-timeout: 15000 rest.idleness-timeout: 300000 rest.retry.delay: 3000 rest.retry.max-attempts: 20 rest.server.max-content-length: 104857600 rest.server.numThreads: 4 restart-strategy.failure-rate.delay: 10 s restart-strategy.failure-rate.failure-rate-interval: 60 s restart-strategy.failure-rate.max-failures-per-interval: 1 restart-strategy.fixed-delay.attempts: 3 restart-strategy.fixed-delay.delay: 10 s restart-strategy: none security.cookie: 9477298cd52a3e409ed0bc570bdc795179fcc7c301a1225e22f47fe0a3db47c2 security.enable: true security.kerberos.login.contexts: Client,KafkaClient security.kerberos.login.keytab: security.kerberos.login.principal: security.kerberos.login.use-ticket-cache: true security.networkwide.listen.restrict: true security.ssl.algorithms: TLS_DHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 security.ssl.enabled: false security.ssl.encrypt.enabled: false security.ssl.key-password: Bapuser@9000 security.ssl.keystore-password: Bapuser@9000 security.ssl.keystore: ssl/flink.keystore security.ssl.protocol: TLSv1.2 security.ssl.rest.enabled: false security.ssl.truststore-password: Bapuser@9000 security.ssl.truststore: ssl/flink.truststore security.ssl.verify-hostname: false slot.idle.timeout: 50000 slot.request.timeout: 300000 state.backend.fs.checkpointdir: hdfs://hacluster/flink/checkpoints state.backend.fs.memory-threshold: 20kb state.backend.incremental: true state.backend: rocksdb state.savepoints.dir: hdfs://hacluster/flink/savepoint task.cancellation.interval: 30000 task.cancellation.timeout: 180000 taskmanager.data.port: 32391-32455 taskmanager.data.ssl.enabled: false taskmanager.debug.memory.logIntervalMs: 0 taskmanager.debug.memory.startLogThread: false taskmanager.heap.size: 1024mb taskmanager.initial-registration-pause: 500 ms taskmanager.max-registration-pause: 30 s taskmanager.maxRegistrationDuration: 5 min taskmanager.memory.fraction: 0.7 taskmanager.memory.off-heap: false taskmanager.memory.preallocate: false taskmanager.memory.segment-size: 32768 taskmanager.network.detailed-metrics: false taskmanager.network.memory.buffers-per-channel: 2 taskmanager.network.memory.floating-buffers-per-gate: 8 taskmanager.network.memory.fraction: 0.1 taskmanager.network.memory.max: 1gb taskmanager.network.memory.min: 64mb taskmanager.network.netty.client.connectTimeoutSec: 300 taskmanager.network.netty.client.numThreads: -1 taskmanager.network.netty.num-arenas: -1 taskmanager.network.netty.sendReceiveBufferSize: 4096 taskmanager.network.netty.server.backlog: 0 taskmanager.network.netty.server.numThreads: -1 taskmanager.network.netty.transport: nio taskmanager.network.numberOfBuffers: 2048 taskmanager.network.request-backoff.initial: 100 taskmanager.network.request-backoff.max: 10000 taskmanager.numberOfTaskSlots: 1 taskmanager.refused-registration-pause: 10 s taskmanager.registration.timeout: 5 min taskmanager.rpc.port: 32326-32390 taskmanager.runtime.hashjoin-bloom-filters: false taskmanager.runtime.max-fan: 128 taskmanager.runtime.sort-spilling-threshold: 0.8 use.path.filesystem: true use.smarterleaderlatch: true web.submit.enable: false web.timeout: 10000 yarn.application-attempt-failures-validity-interval: 600000 yarn.application-attempts: 5 yarn.application-master.port: 32586-32650 yarn.heap-cutoff-min: 384 yarn.heap-cutoff-ratio: 0.25 yarn.heartbeat-delay: 5 yarn.heartbeat.container-request-interval: 500 yarn.maximum-failed-containers: 5 yarn.per-job-cluster.include-user-jar: ORDER zk.ssl.enabled: false zookeeper.clientPort.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 zookeeper.root.acl: OPEN zookeeper.sasl.disable: false zookeeper.sasl.login-context-name: Client zookeeper.sasl.service-name: zookeeper zookeeper.secureClientPort.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 
  • [问题求助] FusionInsight_HD_8.2.0.1产品,在Flink SQL客户端中select 'hello'报错KeeperErrorCode = ConnectionLoss for /flink_base/flink
    flinkSQL client中select 还是报错的,请帮忙指点下,哪里有问题?谢谢org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$SessionClosedRequireAuthException: KeeperErrorCode = Session closed because client failed to authenticate for /flink_base/flink或者org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /flink_base/flinkzookeeper已经启动,192.168.0.82:24002 ,而且zookeeper中的ACL权限已经设置,但是在设置配额失败[zk: 192.168.0.82:24002(CONNECTED) 2] create /flink_base/flink_base Created /flink_base/flink_base [zk: 192.168.0.82:24002(CONNECTED) 3] ls /flink_base/ Path must not end with / character [zk: 192.168.0.82:24002(CONNECTED) 4] ls /flink_base [flink, flink_base] [zk: 192.168.0.82:24002(CONNECTED) 5] [zk: 192.168.0.82:24002(CONNECTED) 5] [zk: 192.168.0.82:24002(CONNECTED) 5] [zk: 192.168.0.82:24002(CONNECTED) 5] setquota -n 1000000 /flink_base/flink Insufficient permission : /flink_base/flink [zk: 192.168.0.82:24002(CONNECTED) 6] getAcl /flink_base/flink 'world,'anyone : cdrwa [zk: 192.168.0.82:24002(CONNECTED) 7] setAcl /flink_base/flink world:anyone:rwcda [zk: 192.168.0.82:24002(CONNECTED) 8] setquota -n 1000000 /flink_base/flink Insufficient permission : /flink_base/flink [zk: 192.168.0.82:24002(CONNECTED) 9] getAcl /flink_base/ Path must not end with / character [zk: 192.168.0.82:24002(CONNECTED) 10] getAcl /flink_base 'world,'anyone : cdrwa [zk: 192.168.0.82:24002(CONNECTED) 11] getAcl /flink_base/flink 'world,'anyone : cdrwa [zk: 192.168.0.82:24002(CONNECTED) 12] ls /zookeeper/quota [beeline, elasticsearch, flink_base, graphbase, hadoop, hadoop-adapter-data, hadoop-flag, hadoop-ha, hbase, hdfs-acl-log, hive, hiveserver2, kafka, loader, mr-ha, rmstore, sparkthriftserver, sparkthriftserver2x, sparkthriftserver2x_sparkInternal_HAMode, yarn-leader-election] [zk: 192.168.0.82:24002(CONNECTED) 13] ls /zookeeper/quota/flink_base [zookeeper_limits, zookeeper_stats] [zk: 192.168.0.82:24002(CONNECTED) 5] setquota -n 1000000 /flink_base/flink Insufficient permission : /flink_base/flink tail -f /home/dmp/app/ficlient/Flink/flink/log/flink-root-sql-client-192-168-0-85.log  中的日志如下flink-conf.yaml中的全部配置如下akka.ask.timeout: 120 s akka.client-socket-worker-pool.pool-size-factor: 1.0 akka.client-socket-worker-pool.pool-size-max: 2 akka.client-socket-worker-pool.pool-size-min: 1 akka.framesize: 10485760b akka.log.lifecycle.events: false akka.lookup.timeout: 30 s akka.server-socket-worker-pool.pool-size-factor: 1.0 akka.server-socket-worker-pool.pool-size-max: 2 akka.server-socket-worker-pool.pool-size-min: 1 akka.ssl.enabled: true akka.startup-timeout: 10 s akka.tcp.timeout: 60 s akka.throughput: 15 blob.fetch.backlog: 1000 blob.fetch.num-concurrent: 50 blob.fetch.retries: 50 blob.server.port: 32456-32520 blob.service.ssl.enabled: true classloader.check-leaked-classloader: false classloader.resolve-order: child-first client.rpc.port: 32651-32720 client.timeout: 120 s compiler.delimited-informat.max-line-samples: 10 compiler.delimited-informat.max-sample-len: 2097152 compiler.delimited-informat.min-line-samples: 2 env.hadoop.conf.dir: /home/dmp/app/ficlient/Flink/flink/conf env.java.opts.client: -Djava.io.tmpdir=/home/dmp/app/ficlient/Flink/tmp env.java.opts.jobmanager: -Djava.security.krb5.conf=/opt/huawei/Bigdata/common/runtime/krb5.conf -Djava.io.tmpdir=${PWD}/tmp -Des.security.indication=true env.java.opts.taskmanager: -Djava.security.krb5.conf=/opt/huawei/Bigdata/common/runtime/krb5.conf -Djava.io.tmpdir=${PWD}/tmp -Des.security.indication=true env.java.opts: -Xloggc:<LOG_DIR>/gc.log -XX:+PrintGCDetails -XX:-OmitStackTraceInFastThrow -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=20 -XX:GCLogFileSize=20M -Djdk.tls.ephemeralDHKeySize=3072 -Djava.library.path=${HADOOP_COMMON_HOME}/lib/native -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false -Dbeetle.application.home.path=/opt/huawei/Bigdata/common/runtime/security/config -Dwcc.configuration.path=/opt/huawei/Bigdata/common/runtime/security/config -Dscc.configuration.path=/opt/huawei/Bigdata/common/runtime/securityforscc/config -Dscc.bigdata.common=/opt/huawei/Bigdata/common/runtime env.yarn.conf.dir: /home/dmp/app/ficlient/Flink/flink/conf flink.security.enable: true flinkserver.alarm.cert.skip: true flinkserver.host.ip: fs.output.always-create-directory: false fs.overwrite-files: false heartbeat.interval: 10000 heartbeat.timeout: 120000 high-availability.job.delay: 10 s high-availability.storageDir: hdfs://hacluster/flink/recovery high-availability.zookeeper.client.acl: creator high-availability.zookeeper.client.connection-timeout: 90000 high-availability.zookeeper.client.max-retry-attempts: 5 high-availability.zookeeper.client.retry-wait: 5000 high-availability.zookeeper.client.session-timeout: 90000 high-availability.zookeeper.client.tolerate-suspended-connections: true high-availability.zookeeper.path.root: /flink high-availability.zookeeper.path.under.quota: /flink_base high-availability.zookeeper.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 high-availability.zookeeper.quota.enabled: true high-availability: zookeeper job.alarm.enable: true jobmanager.heap.size: 1024mb jobmanager.web.403-redirect-url: https://192.168.0.82:28443/web/pages/error/403.html jobmanager.web.404-redirect-url: https://192.168.0.82:28443/web/pages/error/404.html jobmanager.web.415-redirect-url: https://192.168.0.82:28443/web/pages/error/415.html jobmanager.web.500-redirect-url: https://192.168.0.82:28443/web/pages/error/500.html jobmanager.web.access-control-allow-origin: * jobmanager.web.accesslog.enable: true jobmanager.web.allow-access-address: * jobmanager.web.backpressure.cleanup-interval: 600000 jobmanager.web.backpressure.delay-between-samples: 50 jobmanager.web.backpressure.num-samples: 100 jobmanager.web.backpressure.refresh-interval: 60000 jobmanager.web.cache-directive: no-store jobmanager.web.checkpoints.disable: false jobmanager.web.checkpoints.history: 10 jobmanager.web.expires-time: 0 jobmanager.web.history: 5 jobmanager.web.logout-timer: 600000 jobmanager.web.pragma-value: no-cache jobmanager.web.refresh-interval: 3000 jobmanager.web.ssl.enabled: false jobmanager.web.x-frame-options: DENY library-cache-manager.cleanup.interval: 3600 metrics.internal.query-service.port: 28844-28943 metrics.reporter.alarm.factory.class: com.huawei.mrs.flink.alarm.FlinkAlarmReporterFactory metrics.reporter.alarm.interval: 30 s metrics.reporter.alarm.job.alarm.checkpoint.consecutive.failures.num: 5 metrics.reporter.alarm.job.alarm.failure.restart.rate: 80 metrics.reporter.alarm.job.alarm.task.backpressure.duration: 180 s metrics.reporter: alarm nettyconnector.message.delimiter: $_ nettyconnector.registerserver.topic.storage: /flink/nettyconnector nettyconnector.sinkserver.port.range: 28444-28843 nettyconnector.ssl.enabled: false parallelism.default: 1 query.client.network-threads: 0 query.proxy.network-threads: 0 query.proxy.ports: 32541-32560 query.proxy.query-threads: 0 query.server.network-threads: 0 query.server.ports: 32521-32540 query.server.query-threads: 0 resourcemanager.taskmanager-timeout: 300000 rest.await-leader-timeout: 30000 rest.bind-port: 32261-32325 rest.client.max-content-length: 104857600 rest.connection-timeout: 15000 rest.idleness-timeout: 300000 rest.retry.delay: 3000 rest.retry.max-attempts: 20 rest.server.max-content-length: 104857600 rest.server.numThreads: 4 restart-strategy.failure-rate.delay: 10 s restart-strategy.failure-rate.failure-rate-interval: 60 s restart-strategy.failure-rate.max-failures-per-interval: 1 restart-strategy.fixed-delay.attempts: 3 restart-strategy.fixed-delay.delay: 10 s restart-strategy: none security.cookie: 9477298cd52a3e409ed0bc570bdc795179fcc7c301a1225e22f47fe0a3db47c2 security.enable: true security.kerberos.login.contexts: Client,KafkaClient security.kerberos.login.keytab: security.kerberos.login.principal: security.kerberos.login.use-ticket-cache: true security.networkwide.listen.restrict: true security.ssl.algorithms: TLS_DHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_DHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 security.ssl.enabled: false security.ssl.encrypt.enabled: false security.ssl.key-password: Bapuser@9000 security.ssl.keystore-password: Bapuser@9000 security.ssl.keystore: ssl/flink.keystore security.ssl.protocol: TLSv1.2 security.ssl.rest.enabled: false security.ssl.truststore-password: Bapuser@9000 security.ssl.truststore: ssl/flink.truststore security.ssl.verify-hostname: false slot.idle.timeout: 50000 slot.request.timeout: 300000 state.backend.fs.checkpointdir: hdfs://hacluster/flink/checkpoints state.backend.fs.memory-threshold: 20kb state.backend.incremental: true state.backend: rocksdb state.savepoints.dir: hdfs://hacluster/flink/savepoint task.cancellation.interval: 30000 task.cancellation.timeout: 180000 taskmanager.data.port: 32391-32455 taskmanager.data.ssl.enabled: false taskmanager.debug.memory.logIntervalMs: 0 taskmanager.debug.memory.startLogThread: false taskmanager.heap.size: 1024mb taskmanager.initial-registration-pause: 500 ms taskmanager.max-registration-pause: 30 s taskmanager.maxRegistrationDuration: 5 min taskmanager.memory.fraction: 0.7 taskmanager.memory.off-heap: false taskmanager.memory.preallocate: false taskmanager.memory.segment-size: 32768 taskmanager.network.detailed-metrics: false taskmanager.network.memory.buffers-per-channel: 2 taskmanager.network.memory.floating-buffers-per-gate: 8 taskmanager.network.memory.fraction: 0.1 taskmanager.network.memory.max: 1gb taskmanager.network.memory.min: 64mb taskmanager.network.netty.client.connectTimeoutSec: 300 taskmanager.network.netty.client.numThreads: -1 taskmanager.network.netty.num-arenas: -1 taskmanager.network.netty.sendReceiveBufferSize: 4096 taskmanager.network.netty.server.backlog: 0 taskmanager.network.netty.server.numThreads: -1 taskmanager.network.netty.transport: nio taskmanager.network.numberOfBuffers: 2048 taskmanager.network.request-backoff.initial: 100 taskmanager.network.request-backoff.max: 10000 taskmanager.numberOfTaskSlots: 1 taskmanager.refused-registration-pause: 10 s taskmanager.registration.timeout: 5 min taskmanager.rpc.port: 32326-32390 taskmanager.runtime.hashjoin-bloom-filters: false taskmanager.runtime.max-fan: 128 taskmanager.runtime.sort-spilling-threshold: 0.8 use.path.filesystem: true use.smarterleaderlatch: true web.submit.enable: false web.timeout: 10000 yarn.application-attempt-failures-validity-interval: 600000 yarn.application-attempts: 5 yarn.application-master.port: 32586-32650 yarn.heap-cutoff-min: 384 yarn.heap-cutoff-ratio: 0.25 yarn.heartbeat-delay: 5 yarn.heartbeat.container-request-interval: 500 yarn.maximum-failed-containers: 5 yarn.per-job-cluster.include-user-jar: ORDER zk.ssl.enabled: false zookeeper.clientPort.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 zookeeper.root.acl: OPEN zookeeper.sasl.disable: false zookeeper.sasl.login-context-name: Client zookeeper.sasl.service-name: zookeeper zookeeper.secureClientPort.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 
  • [问题求助] FusionInsight_HD_8.2.0.1产品,在Flink SQL客户端中select 'hello'报错KeeperErrorCode = ConnectionLoss for /flink_base/flink
    1.在flink sql client中执行sql  直接报错[ERROR] Could not execute SQL statement. Reason: org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /flink_base/flink 2.而且进入zookeeper中查询也是报错,求解求解[omm@192-168-0-82 zookeeper]$ pwd /opt/huawei/Bigdata/FusionInsight_HD_8.2.0.1/install/FusionInsight-Zookeeper-3.6.3/zookeeper [omm@192-168-0-82 zookeeper]$ bin/zkCli.sh -server 192.168.0.82:24002 Connecting to 192.168.0.82:24002 Welcome to ZooKeeper! JLine support is enabled  WATCHER::  WatchedEvent state:SyncConnected type:None path:null [zk: 192.168.0.82:24002(CONNECTING) 0] ls / KeeperErrorCode = Session closed because client failed to authenticate for / [zk: 192.168.0.82:24002(CONNECTED) 1] WATCHER::  WatchedEvent state:Disconnected type:None path:null  WATCHER::  WatchedEvent state:SyncConnected type:None path:null  WATCHER::  WatchedEvent state:Disconnected type:None path:null 后面是一直循环WATCHER:,flink-conf.yaml中的部分设置如下 flink.security.enable: true flinkserver.alarm.cert.skip: true flinkserver.host.ip: fs.output.always-create-directory: false fs.overwrite-files: false heartbeat.interval: 10000 heartbeat.timeout: 120000 high-availability.job.delay: 10 s high-availability.storageDir: hdfs://hacluster/flink/recovery high-availability.zookeeper.client.acl: creator high-availability.zookeeper.client.connection-timeout: 90000 high-availability.zookeeper.client.max-retry-attempts: 5 high-availability.zookeeper.client.retry-wait: 5000 high-availability.zookeeper.client.session-timeout: 90000 high-availability.zookeeper.client.tolerate-suspended-connections: true high-availability.zookeeper.path.root: /flink high-availability.zookeeper.path.under.quota: /flink_base high-availability.zookeeper.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 high-availability.zookeeper.quota.enabled: true high-availability: zookeeper yarn.application-attempts: 5 yarn.application-master.port: 32586-32650 yarn.heap-cutoff-min: 384 yarn.heap-cutoff-ratio: 0.25 yarn.heartbeat-delay: 5 yarn.heartbeat.container-request-interval: 500 yarn.maximum-failed-containers: 5 yarn.per-job-cluster.include-user-jar: ORDER zk.ssl.enabled: false zookeeper.clientPort.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 zookeeper.root.acl: OPEN zookeeper.sasl.disable: false zookeeper.sasl.login-context-name: Client zookeeper.sasl.service-name: zookeeper zookeeper.secureClientPort.quorum: 192.168.0.82:24002,192.168.0.81:24002,192.168.0.80:24002 
  • [问题求助] flinksql消费DRS过来的kafka消息,关键字怎么处理
    kafka消息格式,参考cid:link_0带有table、sql、old等关键字,flinksql建表时sql校验不通过,这种情况要如何处理
  • [技术干货] Flink(一)-基本概念【转】
    这篇文章主要按照以下思路,简单的交流一下Flink的基本概念和用途。自知资历尚浅,见闻有限,如有纰漏还望指正![点击并拖拽以移动]Flink 简介在当前的互联网用户,设备,服务等激增的时代下,其产生的数据量已不可同日而语了。各种业务场景都会有着大量的数据产生,如何对这些数据进行有效地处理是很多企业需要考虑的问题。以往我们所熟知的Map Reduce,Storm,Spark等框架可能在某些场景下已经没法完全地满足用户的需求,或者是实现需求所付出的代价,无论是代码量或者架构的复杂程度可能都没法满足预期的需求。新场景的出现催产出新的技术,Flink即为实时流的处理提供了新的选择。Apache Flink就是近些年来在社区中比较活跃的分布式处理框架,加上阿里在中国的推广,相信它在未来的竞争中会更具优势。 Flink的产生背景不过多介绍,感兴趣的可以Google一下。Flink相对简单的编程模型加上其高吞吐、低延迟、高性能以及支持exactly-once语义的特性,让它在工业生产中较为出众。相信正如很多博客资料等写的那样"Flink将会成为企业内部主流的数据处理框架,最终成为下一代大数据处理标准。" 2. Flink 架构中的服务类型下面是从Flink官网截取的一张架构图:[点击并拖拽以移动]在Flink运行时涉及到的进程主要有以下两个: JobManager:主要负责调度task,协调checkpoint已经错误恢复等。当客户端将打包好的任务提交到JobManager之后,JobManager就会根据注册的TaskManager资源信息将任务分配给有资源的TaskManager,然后启动运行任务。TaskManger从JobManager获取task信息,然后使用slot资源运行task; TaskManager:执行数据流的task,一个task通过设置并行度,可能会有多个subtask。 每个TaskManager都是作为一个独立的JVM进程运行的。他主要负责在独立的线程执行的operator。其中能执行多少个operator取决于每个taskManager指定的slots数量。Task slot是Flink中最小的资源单位。假如一个taskManager有3个slot,他就会给每个slot分配1/3的内存资源,目前slot不会对cpu进行隔离。同一个taskManager中的slot会共享网络资源和心跳信息。当然在Flink中并不是一个slot只可以执行一个task,在某些情况下,一个slot中也可能执行多个task,如下:[点击并拖拽以移动]一般情况下,flink都是默认允许共用slot的,即便不是相同的task,只要都是来同一个job即可。共享slot的好处有以下两点:当Job的最高并行度正好和flink集群的slot数量相等时,则不需要计算总的task数量。例如,最高并行度是6时,则只需要6个slot,各个subtask都可以共享这6个slot; 2. 共享slot可以优化资源管理。如下图,非资源密集型subtask source/map在不共享slot时会占用6个slot,而在共享的情况下,可以保证其他的资源密集型subtask也能使用这6个slot,保证了资源分配。[点击并拖拽以移动]Flink中的数据Flink中的数据主要分为两类:有界数据流(Bounded streams)和无界数据流(Unbounded streams)。 3.1 无界数据流顾名思义,无界数据流就是指有始无终的数据,数据一旦开始生成就会持续不断的产生新的数据,即数据没有时间边界。无界数据流需要持续不断地处理。 3.2 有界数据流相对而言,有界数据流就是指输入的数据有始有终。例如数据可能是一分钟或者一天的交易数据等等。处理这种有界数据流的方式也被称之为批处理:[点击并拖拽以移动]需要注意的是,我们一般所说的数据流是指数据集,而流数据则是指数据流中的数据。 4. Flink中的编程模型 4.1 编程模型在Flink,编程模型的抽象层级主要分为以下4种,越往下抽象度越低,编程越复杂,灵活度越高。[点击并拖拽以移动]这里先不一一介绍,后续会做详细说明。这4层中,一般用于开发的是第三层,即DataStrem/DataSetAPI。用户可以使用DataStream API处理无界数据流,使用DataSet API处理有界数据流。同时这两个API都提供了各种各样的接口来处理数据。例如常见的map、filter、flatMap等等,而且支持python,scala,java等编程语言,后面的demo主要以scala为主。 4.2 程序结构与其他的分布式处理引擎类似,Flink也遵循着一定的程序架构。下面以常见的WordCount为例:val env = ExecutionEnvironment.getExecutionEnvironment// get input data val text = env.readTextFile("/path/to/file")val counts = text.flatMap { _.toLowerCase.split("\W+") filter { .nonEmpty } } .map { (, 1) } .groupBy(0) .sum(1)counts.writeAsCsv(outputPath, "\n", " ")[点击并拖拽以移动]下面我们分解一下这个程序。第一步,我们需要获取一个ExecutionEnvironment(如果是实时数据流的话我们需要创建一个StreamExecutionEnvironment)。这个对象可以设置执行的一些参数以及添加数据源。所以在程序的main方法中我们都要通过类似下面的语句获取到这个对象:val env = ExecutionEnvironment.getExecutionEnvironment[点击并拖拽以移动]第二步,我们需要为这个应用添加数据源。这个程序中是通过读取文本文件的方式获取数据。在实际开发中我们的数据源可能有很多中,例如kafka,ES等等,Flink官方也提供了很多的connector以减少我们的开发时间。一般都是都通addSource方法添加的,这里是从文本读入,所以调用了readTextFile方法。当然我们也可以通过实现接口来自定义source。val text = env.readTextFile("/path/to/file")第三步,我们需要定义一系列的operator来对数据进行处理。我们可以调用Flink API中已经提供的算子,也可以通过实现不同的Function来实现自己的算子,这个我们会在后面讨论。这里我们只需要了解一般的程序结构即可。val counts = text.flatMap { _.toLowerCase.split("\W+") filter { .nonEmpty } } .map { (, 1) } .groupBy(0) .sum(1)[点击并拖拽以移动]上面的就是先对输入的数据进行分割,然后转换成(word,count)这样的Tuple,接着通过第一个字段进行分组,最后sum第二个字段进行聚合。第四步,数据处理完成之后,我们还要为它指定数据的存储。我们可以从外部系统导入数据,亦可以将处理完的数据导入到外部系统,这个过程称为Sink。同Connector类似,Flink官方提供了很多的Sink供用户使用,用户也可以通过实现接口自定义Sink。counts.writeAsCsv(outputPath, "\n", " ")[点击并拖拽以移动] 小结:以上,通过简单的介绍来了解Flink中的一些基本概念及编程方式。后面会对每个细节进行更为详尽地分析。Flink(一)-基本概念转自知乎:cid:link_0
  • [基础组件] MRS 812 FlinkSQL代码对接集群组件
    背景:为了方便业务开发人员开发FlinkSQL作业、提交FlinkJar作业,以及方便运维人员对Flink作业的管理,FusionInsight MRS研发了FlinkServer可视化开发平台。由于版本迭代问题,MRS812暂不支持FlinkSQL对接Elasticsearch以及客户端提交flinksql任务,MRS 820版本的FlinkSQL也跟MRS812版本的不尽相同, 某局点反馈 公网maven仓库缺少flink sql 对接 es、hbase、kafka、redis、hbase的包,以及flink sql对接这些组件的相关开发指导。当前MRS 820版本的相关包3月中旬已经提供上外部maven仓库,后续过来MRS 812版本也需要flinksql 对接这些组件。为了整合812版本中Flink配置问题以及支持对接组件问题,现给出flink-sql-demo进行客户端代码提交。在flink开发技术支持下 ,整理的MRS 812 flinksql对接es、hbase、kafka、redis、hbase 组件样例,具体对接指导见文件。介绍:该代码适用于flinksql对接es、hbase、kafka、redis、hbase 各组件,对应的sql在config目录下,执行对应组件需修改目录sqlPath的对应sql文件。public class FlinkSQLExecutor { public static void main(String[] args) throws IOException { System.out.println("-------------------- begin init ----------------------"); final String sqlPath = ParameterTool.fromArgs(args).get("sql", "config/redisSink.sql"); final StreamExecutionEnvironment streamEnv = StreamExecutionEnvironment.getExecutionEnvironment(); EnvironmentSettings bsSettings = EnvironmentSettings.newInstance().inStreamingMode().build(); StreamTableEnvironment tableEnv = StreamTableEnvironment.create(streamEnv, bsSettings); StatementSet statementSet = tableEnv.createStatementSet(); String sqlStr = FileUtils.readFileToString(FileUtils.getFile(sqlPath), "utf-8"); String[] sqlArr = sqlStr.split(";"); for (String sql : sqlArr) { sql = sql.trim(); if (sql.toLowerCase(Locale.ROOT).startsWith("create")) { System.out.println("----------------------------------------------\nexecuteSql=\n" + sql); tableEnv.executeSql(sql); } else if (sql.toLowerCase(Locale.ROOT).startsWith("insert")) { System.out.println("----------------------------------------------\ninsert=\n" + sql); statementSet.addInsertSql(sql); } } System.out.println("---------------------- begin exec sql --------------------------"); statementSet.execute(); }}操作步骤:  在windows环境中执行打包,获取Target目录下的 flink-sql-8.1.2-312005.jar,上传在/opt/client/Flink/flink/下,所有使用sql均上传到/opt/client/Flink/flink/config/目录下。在linux客户端配置flink客户端,并将服务端flink的lib目录补充包,上传测试jar和sql文件。启动yarn-session.sh,修改sql文件为本集群的IP,执行 flink-sql-8.1.2-312005.jar验证代码执行flink sql。具体详细步骤见附件。
  • [问题求助] Flink 连接 Es Kerberos认证问题
    使用Flink 向 Es写数据,在Sink 端如何做 Kerberos 认证 ?
  • Flink经典维护案例集合
    Flink全部案例集合见维护宝典:https://support.huawei.com/hedex/hdx.do?docid=EDOC1100222546&lang=zh&idPath=22658044|22662728|22666212|22396131  (FusionInsight HD&MRS租户面集群故障案例(6.5.X-8.X)->维护故障类->Flink->常见故障)经典案例分类序号案例出现频次现网经典案例1.1关于table.exec.state.ttl参数的生效机制★★客户端安装常见问题2.1Flink客户端配置方法(维护宝典:维护类故障(维护类故障(6.5.X-8.X)>Flink>FAQ>下载并配置Flink客户端)★★★★★2.2升级后任务提交失败,报zk节点权限不足(维护宝典:维护类故障(6.5.X-8.X)>Flink>任务提交常见故障>FusionInsight HD大版本从6.5.1升级到8.x版本后任务提交失败)★★★★★2.3开启ssl后,正确的提交方式(维护宝典:维护类故障(6.5.X-8.X)>Flink>任务提交常见故障>创建Flink集群时执行yarn-session.sh命令失败)★★★★★ 
  • [最佳实践] 关于table.exec.state.ttl参数的生效机制
    Flink的table.exec.state.ttl参数说明:Flink SQL 新手有可能犯的错误,其中之一就是忘记设置空闲状态保留时间导致状态爆 炸。列举两个场景:➢ FlinkSQL 的 regular join(inner、left、right),左右表的数据都会一直保存在 状态里,不会清理!要么设置 TTL,要么使用 FlinkSQL 的 interval join。➢ 使用 Top-N 语法进行去重,重复数据的出现一般都位于特定区间内(例如一小时 或一天内),过了这段时间之后,对应的状态就不再需要了。Flink SQL 可以指定空闲状态(即未更新的状态)被保留的最小时间,当状态中某个 key 对应的状态未更新的时间达到阈值时,该条状态被自动清理:基于811版本测试,测试SQL的代码如下: EnvironmentSettings.Builder builder = EnvironmentSettings.newInstance();         builder.inStreamingMode();         builder.useBlinkPlanner();         EnvironmentSettings settings = builder.build();         StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); //        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);          StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env, settings);          env.enableCheckpointing(6000L);       env.setStateBackend(new RocksDBStateBackend("hdfs:///xxxx));//启用hdfs状态后端       env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);       env.getCheckpointConfig().setCheckpointTimeout(600000L);       env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);       env.getCheckpointConfig().setFailOnCheckpointingErrors(false);       env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);          Configuration configuration = tableEnv.getConfig().getConfiguration();         configuration.setString("table.exec.state.ttl","10000");          String sqlTable_1 =                 "CREATE TABLE source1 (\n" +                 "  name varchar(10),\n" +                 "  vaa varchar(10),\n" +                 "  ts AS PROCTIME()\n" +                 ") WITH (\n" +                 "  'connector' = 'kafka',\n" +                 "  'topic' = 'user_source1',\n" +                 "  'properties.bootstrap.servers' = 'kafka:21005',\n" +                 "  'properties.group.id' = 'testGroup',\n" +                 "  'scan.startup.mode' = 'latest-offset',\n" +                 "  'format' = 'csv'\n" +                 ")";          String SqlTable_2 = "" +                 "CREATE TABLE source2 (\n" +                 "  name varchar(10),\n" +                 "  vaa varchar(10),\n" +                 "  ts AS PROCTIME()\n" +                 ") WITH (\n" +                 "  'connector' = 'kafka',\n" +                 "  'topic' = 'user_source2',\n" +                 "  'properties.bootstrap.servers' = 'kafka:21005',\n" +                 "  'properties.group.id' = 'testGroup',\n" +                 "  'scan.startup.mode' = 'latest-offset',\n" +                 "  'format' = 'csv'\n" +                 ")";          String inputTable="create table p (\n" +                 "  name1 varchar(10),\n" +                 "  name2 varchar(10),\n" +                 "  vaa1 varchar(10),\n" +                 "  vaa2 varchar(10)\n" +                 ") with ('connector' = 'print')";          String sql = "insert into\n" +                 "  p\n" +                 "select\n" +                 "  source1.name,\n" +                 "  source2.name,\n" +                 "  source1.vaa,\n" +                 "  source2.vaa\n" +                 "FROM\n" +                 "  source1\n" +                 "  join source2 on source1.name = source2.name";          //创建表1         tableEnv.executeSql(sqlTable_1);         //创建表2         tableEnv.executeSql(SqlTable_2);         //创建输出表         tableEnv.executeSql(inputTable);         //执行结果         tableEnv.executeSql(sql); 执行sql的代码片段如下:在这个用例中我们使用的TTL时间参数为10s失效。当同时输入:topic:user_source1 的数据 zs,aaa   topic:user_source2的数据zs,bbb 结果如下:但是输入:topic:user_source1 的数据ls,aaa 间隔大于10s后输入 topic:user_source2的数据ls,bbb未出现结果:从测试结果来看:(1)常规联接是最通用的联接类型,其中任何新记录或对联接任一侧的更改都是可见的,并且会影响整个联接结果。例如左边有一条新记录,当product id 相等时,它会与右边所有以前和以后的记录合并。SELECT * FROM OrdersINNER JOIN ProductON Orders.productId = Product.id对于流式查询,常规连接的语法是最灵活的,并且允许任何类型的更新(插入、更新、删除)输入表。但是,此操作具有重要的操作含义:它需要将连接输入的两侧永远保持在 Flink 状态。因此,计算查询结果所需的状态可能会无限增长,具体取决于所有输入表和中间连接结果的不同输入行的数量。(2)如果设置了TTL的时间后,过了TTL时间后,之前的状态数据会被删除。
总条数:123 到第
上滑加载中